






JOURNAL OF THE AMERICAN 
STATISTICAL ASSOCIATION 


VOLUME 44: 1949

NUMBERS 245-248



Published Quarterly by the

AMERICAN STATISTICAL ASSOCIATION
WASHINGTON, D. C.

1949 




JOURNAL OF THE AMERICAN 
STATISTICAL ASSOCIATION 


VOLUME 44


March 1949


Number 245


ARTICLES 

On a Unique Feature of Statistics. George W. Snedecor 1
An Attempt to Get the “Not at Homes” into the Sample without Callbacks. Alfred Politz and Willard Simmons 9
Application of Least Squares Regression to Relationships Containing Auto-Correlated Error Terms. D. Cochrane and G. H. Orcutt 32
AOQL Single Sampling Plans from a Single Chart and Table. Donald J. Greb and Julio N. Berrettoni 62
On Measuring Languages. Stuart C. Dodd 77
Confidence Limits in the Non-Parametric Case. Gottfried E. Noether 89
On a Method of Estimating Birth and Death Rates and the Extent of Registration. C. Chandra Sekar and W. Edwards Deming 101
Evaluation of Parameters in the Gompertz and Makeham Equations. J. F. Brennan 116
On the “Information” Lost by Using a t-Test When the Population Variance Is Known. John E. Walsh 122
Wesley Clair Mitchell, 1874-1948. An Appreciation. Simon Kuznets 126

BOOK REVIEWS

British Standards Institution, Fraction-Defective Charts. Albert H. Bowker 132
Abbott, J. C., and Benag, T. J., Principles of Counting and Probability. Herbert Solomon 133
Bratt, Elmer Clark, Business Cycles and Forecasting, Third Edition. William A. Spurr 134
Churchman, C. West, Theory of Experimental Inference. John W. Tukey 136
Enrick, Norbert L., Quality Control: A Manual of Quality Control Procedure Based Upon Scientific Principles and Simplified for Practical Application in Various Types of Manufacturing Plants. J. H. Curtiss 139
E. H. MacNiece 142
Greenshields, Bruce D., Traffic Performance at Urban Street Intersections. Harry G. Romig 142
Hendricks, Walter A., Mathematics of Sampling. T. A. Bancroft 144


















Hill, A. Bradford, Principles of Medical Statistics, Fourth Edition. Margaret Martin 146
Kerrich, J. E., An Experimental Introduction to the Theory of Probability. J. F. Kenney 147
Mainland, Donald, Statistical Methods in Medical Research: I, Qualitative Statistics (Enumeration Data). John W. Fertig 148
A. Bradford Hill 149
Rashevsky, N., Mathematical Theory of Human Relations: An Approach to a Mathematical Biology of Social Phenomena. Frederick Mosteller 150


A cumulative Index to Volumes 1-34, 1888-1939, and Annual Indexes thereafter,

Office of the Secretary of the AMERICAN STATISTICAL ASSOCIATION








JOURNAL OF THE AMERICAN 
STATISTICAL ASSOCIATION 


Number 245 MARCH 1949 Volume 44

ON A UNIQUE FEATURE OF STATISTICS*

George W. Snedecor
Professor of Statistics, Iowa State College

In undertaking the presidency of the American Statistical Association,
my chief purpose was to do my part toward raising the standards of our
profession. I soon found it necessary to clarify my ideas about the
nature of those standards I hoped to raise. Statistics is a sprawling
subject covering loosely the collection of observational data, the
summarization of these data, the drawing of conclusions based upon them,
and pertinent mathematical theory. These processes and theories are
the common property of many disciplines. Is there any unique feature
that distinguishes the professional statistician from his fellows? If so,
it should be the foundation on which standards are set up.

It is immediately clear that if there is anything that characterizes
the professional statistician this thing changes in time. The earliest
preoccupation of statisticians was with military and economic affairs of
the state: human and material resources for making war, and the
spoils of a successful campaign. Much later came a long period in
which statisticians, in the words of R. A. Fisher, “appear to have had
no other aim than to ascertain aggregate or average values.” During
this period, the theory of probability was extensively developed but its
impact on statistical thinking was somewhat superficial. The present
era of statistics is characterized by the emphasis on variation, notably
sampling variation. Variation is interesting not only in itself as a
well-nigh universal phenomenon, but more especially as one source of the
uncertainty in all inductive reasoning. It seems that in our own time,
the professional statistician’s peculiar function is to develop and
publicize the implications of variation. The future I do not pretend to
know; but the consequences of variation, in the broad sense of uncertain
inference, do not seem to have been entirely worked out. It is reasonable
to believe, then, that both at present and in the foreseeable future, the
professional statistician’s most useful contribution to science is in the
theory and practice of uncertain inference. Dr. Flood has put it rather
strikingly in this fashion: “The professional statistician reduces data to
numerical form and uses them to measure the fallibility of a conclusion
where fallibility is estimated exclusively from the data in hand.”

* Presidential address delivered at the 108th Annual Meeting of the
American Statistical Association on December 28, 1948.

The last clause of this definition should be emphasized in advance of
further discussion. All scientists make judgments about fallibility.
They scrutinize their data with care; conclusions are checked against
the theories prevailing in their fields; if there is a reasonable doubt
about the conclusion, additional data are collected. It is only after the
investigator has satisfied himself, with some high degree of assurance,
that his conclusions are valid that he releases them for the critical
observation of his colleagues. The distinction between the professional
statistician and his fellow scientists is that the statistician evaluates
the uncertainty of the conclusion by use of the data themselves, the
evaluation being in the form of an exact statement of probability.

Two other facts should be observed. The first is that scientists are
plagued with many variables beside those that can be reduced to
measurement by the laws of chance. Inaccuracies of various kinds may
creep in; it is only lack of precision, ordinarily numeralized in
experimental or sampling error, that can (through appropriate conduct of
the investigation) be measured. The inaccuracies may invalidate the
conclusion despite the fact that the statistical measure of fallibility
indicates a high degree of confidence in it. From this, one may decide that
the contribution of statistics, in the restrictive sense in which I am
using the term, is of minor or even negligible utility. In what follows,
I hope to show that such is not the case, at least so far as my experience
and observation are evidential.

The second fact to be observed is that the professional statistician
and the investigator in economics, biology, engineering, etc. are usually
the same person. It is merely for convenience that I mention them
separately. Researchers in many fields have seized upon the statistical
devices for measuring uncertainty, so that I include them in
my definition of professional statistician. My thesis is that the
characteristic which distinguishes the present-day professional
statistician is his interest and skill in the measurement of the
fallibility of conclusions.

The layman, I am sure, would be surprised by such a statement. He
is accustomed to the trappings of statistics rather than to the essence
of it. To him statistics is symbolized by long rows of tedious figures
and their display in tables and charts. Even job specifications for
statisticians are commonly limited to the arithmetical processes of
calculating averages, correlation coefficients, trends and probable errors.
These are all useful procedures but they can be carried on as well by a
clerk as by the professional; often, indeed, better. The professional
statistician, whatever his other necessary qualifications, would seem
to be set off from the layman by his habitual awareness of the fallibility
of conclusions based upon data. It is not my purpose to advance further
arguments for the thesis that statistics has this unique function, but
merely to assume that it has and to discuss some of the consequences.

First, let us look at the field of experimental science. In the familiar
sequence of the scientific method (hypothesis, experiment, conclusion),
the part which is peculiarly statistical is the conclusion. This involves
a judgment about the fallibility of that “... logically hazardous
process, the process of generalizing from particular results.” I am quoting
from Mood’s introduction of his “Theory of Statistics.” “The broad
problem of statistical inference is to provide measures of the
uncertainty of conclusions drawn from experimental data.” One highly
developed measure of uncertainty is the statistical test of hypothesis;
this exemplifies the statistical part of the scientific method.

Were the professional statistician to take no further interest in the
procedures of the scientific method he would be fulfilling his essential
share in them by evaluating fallibility, but he would fall far short of
realizing his full usefulness to his fellow scientists. It is not until late
in the sequence that the statistical part of the method finds its place.
At this stage it is often discovered that faults in the design of the
experiment make the measurement of fallibility either unnecessarily
difficult or wholly impossible; it is not unusual to find too late that the
quantity of data furnished by the experiment is inadequate to detect the
effects in question; and it may be evident that the experiment by
change of design could have been made more sensitive, with consequent
saving of effort and money. This means that the professional
statistician may not only facilitate his own job at the end but may increase
the efficiency of the experiment by modifying its design and may,
indeed, rescue it from failure by estimating its required size; all this by
anticipating his own peculiar part in the conclusion. To me it is
astonishing as well as gratifying that the professional statistician, in
order to perform effectively his small but indispensable part in the
scientific method, has been impelled to inspect the whole structure and has
brought about substantial strengthening in many of its members.




Turning next to the field of science in which the survey instead of
the experiment is the device used, we find statistics occupying
somewhat the same unique position. The objective of the survey may be
either to get information about some hypothesis or to estimate one or
more parameters of the population. A survey is planned and executed
in order to get the necessary evidence. The conclusion, in so far as it
is based on the data, is inductive in nature and is subject to
uncertainty. It may be looked upon as the professional statistician’s
particular business to evaluate this uncertainty.

As in experimental science, the professional statistician can enhance
his usefulness by helping with the design of the survey. He can
recommend designs that will furnish appropriate estimates of both position
and scale; he can help choose the design that will be as efficient as is
profitable; and he can specify the size of the sample that, with a
designated probability, may be expected to yield a satisfactorily small
measure of the fallibility of the conclusion regardless of what this
conclusion may turn out to be.

But in this branch of science, the professional statistician is called
upon to make more extensive contributions than those required in
experimentation. Not only must he concern himself with precision but
more especially with conditions affecting accuracy. So far as I can
judge, the majority of the surveys now in operation have sources of
inaccuracy not amenable to measurement by means of the data
obtained. Restriction of the sampling to regions non-randomly chosen
and the purposive selection of respondents are inherent causes of
inaccuracy in the usual type of quota sampling. These causes easily
could be remedied. But the professional statistician cannot stop when
the relatively simple procedures of sampling are improved. He will then
have to join other scientists in their attack on the really tough
problems of schedule construction, selection and training of interviewers,
and the little understood relations between interviewer and respondent.
Although these problems are psychologic, social and economic, they
affect the probabilities which control the measurement of uncertainty
and therefore fall within the purview of the professional statistician’s
interest.

Why do professional statisticians in surveys have graver
responsibilities than those in the experimental sciences? One reason is
that the experiment is an older and better developed instrument than the
survey, requiring less extensive and less obvious improvements. Another is
that in the main experimenters are better trained for their work than
are surveyors, and are heirs to a tradition of severe self-discipline.
Operators of surveys are only beginning to feel the need for examining
their procedures: the embarrassment of the pollers in the recent election
will, I hope, emphasize the necessity for higher standards among all
samplers. If not, increasing loss of confidence by the public is sure to
ensue. A third reason for the heavier responsibilities of professional
statisticians in surveys is that controls are more difficult for
investigators who work with human material; homo sapiens is notoriously
a difficult experimental animal. It may be years or even decades before
the professional statistician’s part in the survey becomes so specialized
as it now is in the experiment.

Reverting next to my opening theme, it seems obvious to me that in
assessing professional standing in statistics, expertness in evaluating
the fallibility of conclusions should play a major role. In saying this, I
am not ignoring the fact that most users of statistics will have little
interest in qualifying as specialists in so narrow a branch of the subject.
Statisticians (who may or may not rate as professionals) have
astonishingly varied activities. The collection of data, the planning that
precedes this collection, the summarizing processes that follow, the
interpretation and reporting of the results: these are preoccupations of
thousands of us. Other thousands, doubtless the majority, have only an
incidental interest in professional statistics, their primary objectives
being in the field of application (economics, industry, medicine, and
dozens more): these usually are included in the fold of statistics because,
in their own subject matter fields, they base their investigations on
observational data. But all who use statistics have this in common:
they are working toward conclusions based on more or less incomplete
enumeration, conclusions that have the uncertainty of all induction.
So, they are all concerned with the evaluation of uncertainty whether
or not they specialize in this unique feature. I believe that every
statistician will be more valuable in his own area if he clearly
apprehends this universal characteristic of his material and that his
professional competence will increase with his expertness in evaluating
the fallibility of conclusions based upon such material.

It is plain that I make a clear distinction between professional
statisticians and statisticians in fields using statistics as a tool,
statisticians who may have no proficiency in measuring uncertainty. These
latter must of necessity make judgments about risk, but they often do this
successfully without actual evaluation. They may be top-flight
scientists or administrators but may never subject their probabilities
explicitly to measurement. It is the measurement of the uncertainties of
conclusions that distinguishes the professional statistician and which
makes him useful to other professionals in the various fields of
application.

Professional statisticians may or may not be mathematicians. The
more mathematics the better, but it is not essential. Of course, the
mathematical statistician must develop the techniques of measurement
and must carefully describe the conditions of their applicability. If
unusual conditions are met, he must be called upon to devise
appropriate new techniques. The non-mathematical professional statistician
must gain experience in the subject-matter fields. He cannot assuredly
evaluate the uncertainty of conclusions unless he is intimately
acquainted with the uncertainties in the data which he uses for his
measurements.

To some, on first thought, it may appear that I am suggesting
unnecessarily rigorous standards. After more careful consideration they
will agree, I think, that this is not so. There is nothing essentially
difficult in the idea of variation and its consequences; I have found that
students in a first course in statistics easily grasp the concepts. The idea
is certainly not new though it has received increasing emphasis during
the last thirty or forty years. Actually I am making the modest
suggestion that professional attainment in statistics be gauged by attitudes
toward statistical thinking of the present rather than of the past. It
seems to me that up-to-dateness is a minimum standard for
professionalism in any field. In my thinking, standards in professional
statistics must be based, at least in part, on modern developments in the
subject; they must include not only proficiency but preoccupation in
the measurement of uncertainty.

Let us now consider the application of my thesis to education.
Fundamentally I am a teacher. I think my chief contribution to statistics is
the training of hundreds of budding scientists in the straight and
narrow way of uncertain inference. Until recently my field has been a
narrow one, limited almost entirely to the statistics of biological
experimentation. Only during the last year have I begun the development of
a course in elementary statistics with broad cultural objectives. This
field seemingly is without limits. The unique feature of statistics, the
evaluation of risk, is part of the daily and hourly living of every one of
us. Uncertainty envelops us, and success or failure in life is the
summation of myriads of decisions as to which is the least hazardous course.
It would seem that one or more courses in statistics would be part of
every student’s training: but, as Dr. Walker said in her presidential
address of 1944, “I have never heard of a liberal arts college that
undertook to explain to its students the stochastic nature of the universe in
which they live and move and have their being.” Why is this? I fear it is
because we, as teachers, have failed to make the subject vital and
convincing. I am afraid we have emphasized the calculational and
graphical devices rather than the essential nature of the subject. Instead of
devoting our energies to bringing the student into harmony with his
physical, biological and economic environment of variation and
uncertainty, we have bored him with another course in arithmetic and
algebra from which meaning is largely omitted. The nature of decisions
based on probability, experience in sampling together with the
concomitant risk in drawing conclusions, the fundamentals of our great
cooperation in insurance, the social implications of betting: these are a
few of the numerous facts of life that should form the structure of our
courses in statistics. I believe that if, during the past fifty years, a
realistic, living statistics had been taught, the subject would now be
considered indispensable by most of our college administrators.

I think it an auspicious sign that a section on business statistics is
being considered during this annual meeting. Business and statistics are
blood brothers in that risk is basic in both. Yet most of our instruction
in business statistics either ignores this common heritage or touches
upon it vaguely in a chapter on sampling tucked away in the latter
part of the book. About the only risk the student seems to be made
conscious of is the risk of a mistake in arithmetic. Is it too much to hope
that some of our forward-looking business executives take the initiative
in advocating the elimination of unrealistic courses in statistics from
our curricula and the substitution of functional courses in their stead?
I judge this could be done in half a dozen years by an energetic
organization with adequate resources. After all, business is a consumer of
the college product and is in a commanding position to insist on quality
control of the output.

The lack of professional standards for teachers of statistics in our
colleges and universities is an astonishing feature of our times. In
fields other than statistics, even an instructor is often required to have
the doctor’s degree in the subject matter of the department; the
bachelor’s degree is an almost universal minimum requirement. Yet
how many teachers of statistics have been graduated from a curriculum
in statistics? It apparently never occurs to the head of a department of
education, for example, to ask a prospective teacher of statistics about
his degrees in statistics. Most of us are graduates of such departments
as economics, mathematics, business or psychology. Our academic
training in statistics may have been no more extensive than the courses
we now teach. Personally, we are not to blame for this because in our
generations there were no curricula in statistics. Even at present they
are distressingly few. But we as teachers should be aggressively
dissatisfied with such a condition. We should work for the establishment of
departments of statistics and should each strive to improve his own
professional standing. We should resolve that the next generation of
teachers shall have advantages not available to us. Our Section on the
Training of Statisticians has the glorious opportunity of leading in this
high endeavor.

What are the implications of my thesis for the American Statistical
Association? Under our new constitution we have abandoned our
preoccupation with any one subject-matter field and have volunteered our
services as a focus of statistics among them all. In this capacity, we
were asked to participate in the work of the Hoover Commission,
suggesting desirable reorganization in the statistical agencies of the
government. We are joining the Social Science Research Council in
reviewing the recent election predictions of three of the most prominent
polling organizations. Two government bureaus have asked criticism and
suggestions for their programs. In meeting such responsibilities, our
Commission on Statistical Standards and Organizations seems destined
to wield a powerful influence in statistical affairs. These things can be
done only because we have in our membership professional statisticians
of the highest competence. To maintain and enlarge this leadership, we
must attract to our ranks other able statisticians from all
subject-matter fields, professional statisticians whose skills include that
unique function of modern statistics, the measurement of the uncertainty of
inductive conclusions.



AN ATTEMPT TO GET THE “NOT AT HOMES”
INTO THE SAMPLE WITHOUT CALLBACKS

Alfred Politz
AND
Willard Simmons

PART I

This paper describes a plan for eliminating the need for
callbacks. Each person in the sample is visited only once. From
each person interviewed information is obtained as to whether
or not he was at home on specific instances, including the
instance of the interview, which permits an estimate of the
proportion of time he is at home during the interviewing hours.
Questionnaires are divided into e.g. 6 groups according to the
estimated proportion of time persons in each group are at
home, viz., 1/6, 2/6, . . ., 6/6 of the time. The sample
estimate, for any variable under study, is produced by weighting
the results for each group by the reciprocal of the estimated
per cent that persons are at home. It is shown that under
certain conditions this estimate is unbiased and the variance of
the estimate is obtained. A numerical comparison is made
between this plan and the usual method of calling back.
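
The weighting described in this summary can be sketched in a few lines
of Python. This is only an illustration under stated assumptions: the
function name, the record layout (a count k of instances, out of 6 and
including the interview itself, on which the respondent reports being at
home, paired with the value y of the variable under study), the division
by the sum of the weights, and the three sample respondents are all
hypothetical and are not taken from the paper.

```python
from fractions import Fraction


def weighted_mean(records, n_instances=6):
    """Weighted sample estimate of a mean (illustrative sketch only).

    records: list of (k, y) pairs, where k is the number of instances
    (out of n_instances, the interview itself included) on which the
    respondent reports being at home, and y is the variable under study.
    """
    numerator = Fraction(0)
    denominator = Fraction(0)
    for k, y in records:
        if not 1 <= k <= n_instances:
            raise ValueError("k must lie between 1 and n_instances")
        # Weight is the reciprocal of the estimated at-home proportion k/n.
        weight = Fraction(n_instances, k)
        numerator += weight * y
        denominator += weight
    return numerator / denominator


# Hypothetical respondents: (instances at home out of 6, value of y).
sample = [(6, 10), (3, 20), (1, 40)]
estimate = weighted_mean(sample)
```

Because the interview itself counts as one instance, k is never zero for
an interviewed person, so every weight in this sketch is finite.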


Many individuals are not available for an investigation on the
first visit because they are not at home when the interviewer calls
on them. These cases are often referred to as the “not at homes.”
Depending on the time when interviews are feasible and on the kind of
individuals under question, the percentage of “not at homes” usually
varies between 30 and 60. The “not at homes” thereby constitute a
factor of extreme importance. The simplest theoretical device for the
completion of the sample consists of revisiting, again and again, the
homes where a certain individual was not found on the first call, until
the particular individual is found. These callbacks are spread thinner
and burdened with longer travel time than first-call interviews. The
second visit is more expensive than the first visit, the third visit is more
expensive than the second one. The economic burden increases with
subsequent calls to the point that a certain percentage of attempted
interviews usually is considered unobtainable. The increased costs per
information unit derived from callbacks make it, in most cases,
advisable not to attempt revisiting all the “not at homes” of the primary
sample, but to revisit only a sub-sample of it.¹ While sub-sampling increases
the sampling error, it still need not introduce biases. The biases start
only where revisiting of “still not at homes” stops. With the callbacks as
a major source of expense in unbiased population samples, it has been
worthwhile to study the possibility of circumventing the need for them
altogether. During the past three years, we have developed a plan for
eliminating the need for callbacks and several experiments have been
made applying this plan to market surveys.²

The first step may be a review of the meaning of the “not at homes.”

1) If the survey is concerned with items open to observation within the
household or with items about which nearly every adult member of the
household can give information, it usually suffices to design a sample
of households. An investigation of a household in the sample becomes
impossible if nobody is at home (more accurately, no adult is at home)
at the time when the interviewer rings the doorbell. 2) If the
investigation is concerned with problems where an individual reports about
himself (buying habits, opinions on social and political issues, taste
preferences, etc.), a sample of individuals is designed. Under these
circumstances, it no longer suffices that somebody (some adult) is at home.
A particular individual has to be found. If this particular individual is
not at home when the interviewer rings the doorbell, the information
cannot be obtained. This paper is concerned solely with samples of
individuals. It is obvious that the not-at-home rate in samples of
individuals is higher than the not-at-home rate in samples of households.

If several callbacks are made in order to reach individuals not found
at home on the first visit, the assumption is maintained that the
individual not at home at one instance will be at home some time during those
hours when personal visits are possible. Individuals who are at home
only during night hours, let’s say from 10 p.m. to 8 a.m., drop out of
almost every sample. By leaving such inaccessible extremes aside, we
may say that people who are not at home at the first visit but can be
found in a second, third, fourth, fifth or sixth visit, are persons who
stay away from home more often by varying degrees. The average
frequency of staying away from home among the second call respondents
is higher than among the first call respondents; the average frequency
of staying away from home among the third call respondents is higher
than among the second call respondents, and so forth. Because of the
fact that some people are away from home more often, it becomes
necessary on the average to visit their homes more often before they
can be found at home. But at one time if callbacks are continued
indefinitely, they actually are found by the interviewer. Let’s say an
interviewer finds the respondent, Mr. Smith, at the occasion of the
third call at 8 o’clock in the evening on Wednesday. If the survey
schedule and interviewing assignment had accidentally brought the
interviewer to the home of Mr. Smith in the first place at 8 o’clock on
Wednesday night, he would have been found at home at the first visit. Mr.
Smith then never would have belonged to the group of “not at homes.”

¹ A plan for sub-sampling non-responses to mail questionnaires is discussed
by Morris H. Hansen and William N. Hurwitz, The Problem of Non-Response in
Sample Surveys, Journal of the American Statistical Association, December,
1946, page 517. The principles set forth for determining the size of the
original sample and the size of the sub-sample of non-responses to maximize
efficiency for a given cost are applicable to the “not at home” problem.

² It has recently been brought to the authors’ attention that a somewhat
similar plan was proposed independently by H. O. Hartley before the Royal
Statistical Society. This proposal was made in commenting upon a paper by
F. Yates, A Review of Recent Developments in Sampling and Sampling Surveys,
Journal of the Royal Statistical Society, Vol. CIX, Part 1, 1946, page 87.

This may make it obvious that every sample of “at homes” must
include “potential not at homes.” To put it more accurately, every set of
first call interviews of timing A must include respondents who are
“not at homes” in another set of interviews of timing B. It must include
respondents who are not at home in timing C and it must include
respondents who are not at home in timing B and timing C. Statistically,
therefore, it must be possible to reconstruct from a present “at home”
sample, past samples of “at homes” and “not at homes,” if: (a)
respondents provide information on their past “at home” performance,
and (b) if the individuals in the present “at home” sample are visited at
times chosen at random.

Consider, for example, the following three groups, among which all
individuals in the population are distributed: 1) those who are at home,
on the average, 20% of the time, 2) 50% of the time, and 3) 80% of the
time. If the time of visits is determined at random, we would expect to
find on the first call about 20% of group (1), 50% of group (2), and 80%
of group (3). Now if each person in the sample can only be identified
with the group to which he belongs, a correction for the
under-representation of each group is clearly indicated. Since only
about one-fifth of the persons in the first group are interviewed, this
group is assigned a weight of 5. Likewise, the second group receives a
weight of 2 because only about half of the persons in this group are
found at home, while the third group receives the weight of 1.25. This
weighting, of course, does not completely eliminate the bias, for it
takes into account only three arbitrarily defined groups. On the other
hand, the bias must be reduced because the weighting has at least
partially compensated for the under-representation of persons
frequently away from home.
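The arithmetic of the three-group example can be sketched in a few lines. The group sizes of 1,000 persons each are an assumption made purely for illustration; only the at-home shares (20%, 50%, 80%) and the resulting weights come from the text.

```python
# Illustrative group sizes (assumed, not taken from the paper).
population = {0.20: 1000, 0.50: 1000, 0.80: 1000}  # at-home share -> persons

# Expected number found at home on one call made at a random time.
found = {p: size * p for p, size in population.items()}

# Weight each group by the reciprocal of its at-home share: 5, 2 and 1.25.
weights = {p: 1 / p for p in population}

# Weighted counts restore the original group sizes.
restored = {p: found[p] * weights[p] for p in population}

# Share of the least-at-home group before and after weighting.
unweighted_share = found[0.20] / sum(found.values())
weighted_share = restored[0.20] / sum(restored.values())
print(weights)
print(round(unweighted_share, 3), round(weighted_share, 3))
```

In this sketch the group at home 20% of the time makes up one-third of the population but a much smaller share of the first-call interviews; the weights bring it back to one-third.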

The number of such groups, however, need not be restricted to three.
With obvious modifications, the above example is applicable to any
number of such groups into which the population might be divided,
where each group contains persons who are at home any part of the
time during which interviewing is in progress. Consider the limiting
case in which there are as many such groups as there are persons in the
population; that is, each person is a group of one. If it were possible
to assign to each individual found at home a weight equal to the
reciprocal of the per cent of time he spends at home, the not-at-home
bias would vanish.³ Such weighting would completely compensate for the
under-representation of persons frequently away from home.

Estimating the Per Cent of Time Respondents Are At Home

While it is hardly possible to find out the exact percentage of the time
an individual is at home, it has proven feasible to estimate this
percentage from information obtained by direct questions to the
respondent himself. The problem of phrasing such questions has naturally
received considerable attention and experimentation. Any such question
as, “Are you usually at home in the evening?”, of course, is valueless.
A more specific question, such as, “How many nights out of the last
five, were you at home?”, is much better, but is still subject to two
objections: 1) the respondent is likely to answer without thinking back
over his activities on the previous five evenings, and 2) no provision
is made for the respondent who was at home during part of an evening
and away for the remainder of the evening. While further improvements
are possible, the following questions have proven to be satisfactory:
1) Would you mind telling me whether or not you happened to be at home
last night at just this time? 2) How about the night before last at this
time? 3) How about Wednesday night? 4) How about Tuesday night?
5) Monday night? These questions relate specifically to interviews
conducted on Saturday and the particular days of the week mentioned in
questions 3, 4 and 5 are changed, as appropriate for interviews
conducted on other nights. To alleviate any possible resentment at the
personal nature of the inquiry, interviewers find it helpful to preface
the questions by some statement like, “We are also interested in finding
out how often people go out in the evening at various times and on
various days of the week. I wonder if you would mind telling me ...
etc.”


³ See Part II, page 22, footnote 10.




It is important to decide upon an optimum number of nights about
which inquiries should be made, bearing in mind the limitations of
respondents’ memories and willingness to cooperate, as well as the
obvious advantage of having information for as many nights as possible.
Experience with this technique in several field operations has led to
the acceptance of information for five previous nights as perhaps the
most suitable, where evening interviews are conducted Monday through
Saturday. Since the interviewer obtains information about one night
by observation, this makes a total of six nights upon which an estimate
may be based of the actual per cent of time respondents are at home. A
clear advantage lies in the fact that six nights coincide with a
complete week of interviewing, and the effects of any tendency of
respondents to go out more frequently on certain nights of the week are
eliminated. Experience with this plan has indicated that respondents
are almost always able and willing to give answers for five nights
previous. This is borne out by the extremely small number of
non-responses and “don’t knows” to these questions. Because the
questions relate to an individual’s personal activities, it is the type
of information respondents do know, and are not reluctant to impart.

Sample Projections Based on Unbiased Estimates of the Time Respondents
are At Home

Information concerning the number of nights each respondent is at
home, out of six specified nights at a particular time, provides an
unbiased estimate of the actual per cent of the time each respondent is
at home.⁴ This information makes it possible to divide the respondents
into six groups, each having a weight depending upon the proportion of
such individuals expected to be found at home, as follows:


  (1)               (2)                   (3)
 Group      Estimated proportion        Weight
            of time spent at home

   1                1/6                   6.0
   2                2/6                   3.0
   3                3/6                   2.0
   4                4/6                   1.5
   5                5/6                   1.2
   6                6/6                   1.0


The weights in column (3) are the reciprocals of the estimated pro¬ 
portion of the time individuals in each group are at home (column 2). 
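As a minimal sketch of how a respondent's answers translate into a group and a weight, the following assumes the five answers are recorded as booleans; the function names are hypothetical and not part of the paper.

```python
from fractions import Fraction

def nights_at_home(previous_five):
    """Nights at home out of six: five reported nights plus the interview night."""
    if len(previous_five) != 5:
        raise ValueError("expected answers for exactly five previous nights")
    return 1 + sum(1 for answer in previous_five if answer)

def group_weight(nights):
    """Reciprocal of the estimated proportion of time at home (nights / 6)."""
    if not 1 <= nights <= 6:
        raise ValueError("an interviewed person is at home on 1 to 6 of the six nights")
    return Fraction(6, nights)

# Hypothetical respondent: at home on two of the five previous nights.
answers = [True, False, False, True, False]
k = nights_at_home(answers)
print(k, float(group_weight(k)))  # group 3, weight 2.0
```

The interview night always counts as a night at home, which is why no interviewed person can fall below group 1.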


⁴ See Part II, page 18.









Information obtained from interviews with individuals in each group
may be multiplied by the corresponding group weight to produce an
estimate for all individuals in the group, of those originally in the
sample, including persons not at home. There is one group, however, in
the original sample, from whom no interviews have been received;
that is the group containing those persons who were not at home on any
of the six nights, including and preceding the night the interviewer
called at their homes. This group, of course, is comparable to the group
which is not reached even after five callbacks.

If information is obtained for six nights at random, estimates based
on this technique are subject to no not-at-home bias other than the bias
contributed by the relatively small group who are not at home on any
of six selected nights.⁵ As far as bias is concerned, therefore, such an
estimate is equivalent to that obtained by a sampling plan in which six
calls are made when necessary to find the “not at homes.” In making
this comparison, however, some allowance must be made for the
additional contribution to the sampling error because of failure to
obtain interviews from any “not at homes.” In actual practice, it has
been found that this increase in sampling error seldom exceeds 2%
(coefficient of variation). Moreover, for almost all situations likely
to be encountered in practice, the use of this technique in combination
with only one callback will produce more reliable estimates,
considering both the bias and sampling error, than may be obtained from
a five-callback interviewing operation which does not provide for
information on previous “at home” performance.

The great expense of repeated callbacks, however, has already led to
extensive sub-sampling of the “not at homes.” This procedure, developed
by the Census Bureau, has gained wide acceptance by scientific
samplers, because it frequently produces closer estimates for the cost.
In a sub-sampling operation, however, consideration must also be
given to increased sampling error. The addition to sampling error,
occasioned by a sub-sampling operation, is likely to be greater than the
increase resulting from use of the proposed plan, even assuming an
optimum allocation of interviews on first call, second call, and
subsequent callbacks. In fact, the numbers of persons, discovered by
inquiry, who are at home one night out of six, two nights out of six,
etc., will usually correspond fairly closely with the optimum
allocation of callback interviews, if this information had been
available. This assumes, of course, that the optimum numbers of
interviews to be obtained on successive callbacks are determined after
taking into account the increased costs of obtaining interviews after
repeated callbacks. It would seem, therefore, that for population
studies, employing samples based on probability theory instead of the
inevitable errors in judgment, the use of this plan will yield
impressive economies of both time and money.

⁵ See Part II, page 21.

While this paper uses the past 5 days’ performance with regard to
being at home as the basis for the development of strata, workers who
want to use this method may deal with situations in which a smaller
number of days is justified. The decision they have to face is similar
to deciding on the maximum number of callbacks to be made on “not at
homes.” On a special survey in which the writer had to get a
measurement of the biases in a judgment sample, it was considered
necessary to make up to eight callbacks. For surveys where less
precision is required, three or even two callbacks are set as a
maximum. It is impossible, without reference to the particular subject
under study, to make a final statement about the number of callbacks
necessary, or about its approximate equivalent; that is, the number of
days’ past performance that should be covered.

While the mathematics of the “not at home” calculation is explained
in Part II, one point, of a psychological nature, may require a reference
from practical experience. It is the question mentioned before as to
whether respondents can report about their past at-home performances
with sufficient accuracy. There is no doubt that many survey questions
burden the memory and honesty of the respondent much more than
the at-home question. However, it would not be good policy to take
possible inaccuracies in sampling lightheartedly just because surveys
as a whole have their weaknesses anyway. It is for this reason that
actual field experiences should be quoted. Following small scale
experiments which cleared the path, the method has been employed since
the latter part of 1947 in major area surveys.

It is well known that men and women differ substantially in their at- 
home performance. Therefore, in a survey of the Philadelphia metro¬ 
politan area, the proportion of females and males in the population 
was left to discovery by the survey using the new device. 


PHILADELPHIA METROPOLITAN

Per cent males directly       Per cent males estimated     Per cent males estimated by
counted in at-home sample     from at-home questions       U. S. Census, 1947

         43.1%                        47.8%                        47.4%


The difference between the survey estimate and the Census estimate is
well within the sample tolerance.










An internal check on response reliability is provided in this plan 
without resorting to comparisons with Census data. On the one hand, 
a record may be kept by interviewers of the number of persons found 
at home and the number not at home of all persons visited. This permits 
a direct estimate of the per cent of persons who are at home when the 
interviewer calls. On the other hand, the expected per cent of persons 
at home at a given time chosen at random is equivalent to the average 
per cent of the time all persons are at home. This may be estimated 
directly from the information obtained from respondents concerning 
the number of nights each respondent was at home out of the past six 
nights. This comparison between two independent estimates of the 
average per cent of persons at home is usually made in order to check 
the over-all accuracy of respondents’ answers, interviewers’ records and
any other source of error. The results of this check for the Chicago
survey are as follows: 


CHICAGO METROPOLITAN AREA

                                                      At home     Not at home
                                                      per cent    per cent

Based on actual interviewers’ records of number
of persons visited and number found at home             61.1        38.9

Based on respondents’ answers concerning the
number of nights they are at home                       61.5        38.5


It is advisable for anyone who may use the method in the future to
maintain a direct count of the individuals not found at home, although
it does not add to the information which is sought by the survey.
However, the direct count provides an elegant internal statistical check
on the reliability of the response to the not-at-home question, and
thereby indirectly, a check on the interviewers’ carefulness in dealing
with the respondents.
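The two sides of this internal check might be computed as in the sketch below. The paper does not spell out the formula for the respondent-based figure; the version here is an assumption, namely the mean of the estimated proportions i/6 weighted by 6/i, which reduces to the number of respondents divided by the sum of their weights. All data are hypothetical.

```python
def at_home_rate_from_records(found_at_home, visited):
    """Direct estimate: share of all persons visited who were found at home."""
    return found_at_home / visited

def at_home_rate_from_answers(nights_list):
    """Weighted mean of i/6 with weights 6/i, which simplifies to n / sum(6/i)."""
    estimated_population = sum(6 / i for i in nights_list)
    return len(nights_list) / estimated_population

# Hypothetical data: ten persons visited, six found at home and interviewed.
nights = [6, 6, 3, 2, 6, 4]  # nights at home out of six, one entry per respondent
direct = at_home_rate_from_records(found_at_home=6, visited=10)
indirect = at_home_rate_from_answers(nights)
print(round(direct, 3), round(indirect, 3))
```

Close agreement between the two figures, as in the Chicago table, suggests that respondents' answers and interviewers' records are consistent with one another.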










FURTHER THEORETICAL CONSIDERATIONS 
REGARDING THE PLAN FOR ELIMINATING 
CALLBACKS 

PART II

The plan for eliminating callbacks described in Part I may be
summarized briefly as follows:

1) Each person in the sample is visited once and only once at a
time determined at random, considering only the periods during
which interviews are to be conducted.

2) From each person interviewed, information is obtained as to
whether or not he was at home at six specific instances,
determined at random, including the instance of the interview, which
permits an estimate of the proportion of time he is at home
during interviewing hours.

3) Questionnaires are divided into six groups according to the
estimated proportion of time persons in each group are at home,
viz., 1/6, 2/6, ..., 6/6 of the time, for groups one to six,
respectively.

4) The sample estimate, for any variable under study, is produced
by weighting the results for each group by the reciprocal of the
estimated per cent of time persons in the group are at home.
Thus, the weights for groups one to six are, respectively, 6/1,
6/2, ..., 6/6.
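The four steps above can be sketched as a small estimator. The record layout `(x, i)` and the function names are assumptions for illustration; the ratio form in `weighted_mean` is one natural way to turn the weighted totals into a proportion, not a formula quoted from the paper.

```python
def weighted_total(records):
    """Estimate a population total as the sum of 6 * x / i over interviewed persons.

    Each record is (x, i): x is the value of the variate for the person and
    i is the number of nights (1 to 6) the person was at home out of six.
    """
    total = 0.0
    for x, i in records:
        if not 1 <= i <= 6:
            raise ValueError("nights at home must lie between 1 and 6")
        total += 6 * x / i
    return total

def weighted_mean(records):
    """Weighted total divided by the weighted count of persons."""
    persons = weighted_total([(1, i) for _, i in records])
    return weighted_total(records) / persons

# Hypothetical sample: x = 1 for males, 0 for females.
sample = [(1, 2), (0, 6), (1, 3), (0, 1)]
print(weighted_total(sample), weighted_mean(sample))
```

Used on an indicator variable such as sex, the weighted mean gives the kind of figure reported for Philadelphia in Part I.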

Assumptions Made

The population to which the sample estimate relates is restricted to
those individuals who are at home at least some time during
interviewing hours; that is, those persons who could eventually be found
by callbacks during regular interviewing hours. The decision concerning
the hours of interviewing is, of course, extremely important to the
survey results, and the shorter the daily interviewing period, the
larger the number of persons arbitrarily given no chance of being found
at home.⁶ For example, employed persons thus may be excluded from
daytime interviews, persons attending night school may be excluded from
evening interviews; whereas, neither group is excluded from an
interviewing schedule including both daytime and evening hours. In the
limiting case, the excluded group consists only of those persons who
are never at home, or who have no home, since interviews are
theoretically possible at all hours. In the following discussion it will
be convenient to consider an experiment in which each of the N persons
in the population is visited one time. Because the not-at-home problem
is no different for a probability sample than for an attempted total
census of the entire population, this involves no loss of generality.
Let us assume that interviews are obtained from all of the n persons
who are found at home, i.e., no person refuses to be interviewed, and
that (N - n) persons are not at home when the interviewer calls. Now in
effect we have a sample of n individuals in which the probability of
including any person is equal to the probability that person is at home
when the interviewer calls.

⁶ Because more persons are usually at home at certain times than at other times, further
consideration of the “optimum” periods during the day for interviewing may be worthwhile,
possibly leading to a stratification by time of day and of the principles for optimum
allocation of sample cases to such strata.

The random choice of a time of visiting each person is to avoid the
arbitrary exclusion of persons who are never at home, for example, at
the time of day and day of the week at which, otherwise, it may be
decided arbitrarily to visit them.⁷ When an interviewer rings a
doorbell, he is sampling time. He has chosen at random one particular
moment from a large number of possible moments to ascertain whether or
not the respondent is at home. The chance that an individual is
interviewed, therefore, is exactly equal to the per cent of the time
that individual spends at home, counting only the hours during which
interviews are conducted. Moreover, for each of the n individuals who do
happen to be at home, the interviewer obtains a sample of five
additional points in time. The questions suggested in Part I, together
with the interviewer’s observation at the time of his visit, provide a
systematic sample from a random start of a cluster of six moments
spaced twenty-four hours apart. It may be easily shown that this sample
provides an unbiased estimate of the per cent of time each individual
is at home. Let $p_{jk}$ equal unity if the $j$th person is at home at
the $k$th moment, otherwise $p_{jk}$ equals zero. Since the moments were
selected with equal probability, it follows that
$\hat{p}_j = \frac{1}{6}\sum_{k=1}^{6} p_{jk}$ is an unbiased estimate of
$p_j$, the actual per cent of time that the $j$th individual is at home
($p_j = \frac{1}{M}\sum_{k=1}^{M} p_{jk}$, where $M$ is total number of
moments of interviewing).⁸

⁷ A random choice of the time of calling on any individual may be approximated closely in
actual operations without creating any severe administrative problems. It does not require,
for example, that the selection of a time of visiting different individuals be independently
at random. It is quite feasible to select first at random an evening on which all persons in
a particular location or cluster will be visited, to order the visits for convenient travel
between homes, selecting at random only the point within the cluster at which the
interviewer is to begin making calls at a specified time. While not strictly meeting the
requirements of randomness, this procedure results in a fairly close approximation.

⁸ A moment is chosen merely for convenience. Actually any other unit of time would serve as
well provided it is understood that the interviewer does not wait for respondents to return
home but simply takes the time necessary to ascertain whether or not the respondent is
already at home when the doorbell rings. If one prefers to consider the unit of time
infinitesimal, $\hat{p}_j$ can be shown to be unbiased by employing the integral
$\int p_{jk}\,dk$.






We may further define $\bar{p} = \frac{1}{N}\sum_{j=1}^{N} p_j$ as the
average per cent of time all N persons in the population are at home.

Since information concerning the time each person in the sample is at
home is obtained for six moments, it is convenient to think of the
population as being divided into seven groups which correspond to the
actual and potential answers regarding the number of nights they were at
home. The following seven groups are, therefore, defined:


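TABLE I

STRATA BASED ON THE NUMBER OF NIGHTS ON WHICH INDIVIDUALS ARE
AT HOME AT A SPECIFIED MOMENT

No. of nights    Proportion of time    Size of   Estimated     Size of   Pop.     Sample
at home out          at home           pop.      size of       sample    total    total
of six (i)        Est.      Act.                 pop.

     0            0/6   $\bar{p}_0$    $N_0$        -             -      $X_0$       -
     1            1/6   $\bar{p}_1$    $N_1$    $\hat{N}_1$    $n_1$     $X_1$   $\hat{X}_1$
     2            2/6   $\bar{p}_2$    $N_2$    $\hat{N}_2$    $n_2$     $X_2$   $\hat{X}_2$
     3            3/6   $\bar{p}_3$    $N_3$    $\hat{N}_3$    $n_3$     $X_3$   $\hat{X}_3$
     4            4/6   $\bar{p}_4$    $N_4$    $\hat{N}_4$    $n_4$     $X_4$   $\hat{X}_4$
     5            5/6   $\bar{p}_5$    $N_5$    $\hat{N}_5$    $n_5$     $X_5$   $\hat{X}_5$
     6            6/6   $\bar{p}_6$    $N_6$    $\hat{N}_6$    $n_6$     $X_6$   $\hat{X}_6$

Groups one through six  $\bar{p}$      $N$      $\hat{N}$      $n$       $X$     $\hat{X}$
Groups zero through six                $N_t$        -          $n$       $X_t$       -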


The size of the population in each group, $N_i$ ($i = 1, 2, \cdots, 6$)
is, of course, unknown. However, the $N_i$ are rigidly defined. For
example, the $N_1$ persons in Group one include all persons whose
answers indicated that they were at home only on the night the
interviewer called and on none of the other five nights, at the
specified time. In addition, the population in Group one includes all
persons, not at home when the interviewer called, who could have
truthfully given such an answer for the previous five nights if they
had been asked the questions at the particular moment that the
interviewer called at their respective homes. The zero group is made up
entirely of the latter, since a person who was not home at least one
night in the specified six could not fall into the sample. The seven
groups are mutually exclusive and exhaustive and are sampled
independently except for the zero group which is not sampled at all.
The other six groups may be properly treated as strata. It is assumed
that the population is large enough and that the distribution among the
population of the variate $p_j$ is such that the sample does include
some persons in each of the groups one to six. It will be noted that
the $N_i$ are precisely defined only after the time of visiting each
respondent is fixed, and that a change in the schedule of calls will
change the numbers and identity of the persons falling in each of the
seven groups. In this sense, the $N_i$ are variates, having a sampling
distribution, which assume definite values whenever the time for
visiting all respondents is selected. With these conditions in mind,
consider the sample estimate ($\hat{X}$) of the population total ($X$)
for the characteristic under study.

(4)    $\hat{X} = 6 \sum_{j=1}^{n} \frac{x_j}{i}$

where $x_j$ is the value of the variate under study for the $j$th
person, $n$ is the number of persons found at home and $i$ is the number
of nights any individual is at home out of the six nights including and
just preceding the night of the interview. Thus, the value of the
variate for each individual in the sample is weighted by the reciprocal
of the estimated per cent of time he is at home.

The usefulness of this entire technique for eliminating callbacks
depends largely upon whether or not $\hat{X}$ is a “good” estimate in
the sense that: (1) it is unbiased and (2) has a small sampling error
for the class of population for which it is appropriate.


Mean of the Sample Estimate

It follows from equation (4) that:

(5)    $E\hat{X} = 6 \sum_{j=1}^{N_t} p_j E_j \frac{x_j}{i}$

where $p_j$ is the probability that the $j$th individual is at home when
the interviewer calls, and $E_j\,x_j/i$ denotes the conditional
expectation of $x_j/i$ knowing that the $j$th person is found at home
and interviewed. The value of $x_j/i$, for each person interviewed,
depends upon how many nights out of the previous five nights that
person was at home. Thus we have,

(6)    $E_j \frac{x_j}{i} = \sum_{i=1}^{6} \frac{x_j}{i} \cdot \frac{5!}{(i-1)!\,(6-i)!}\, p_j^{\,i-1} (1 - p_j)^{6-i}$

in which the coefficients of $x_j/i$ in the six terms of the summation
are, respectively, the probability that the $j$th person was at home
0, 1, 2, ..., 5 previous nights; viz., the six terms in the expansion
of $(q_j + p_j)^5$ where $q_j = 1 - p_j$. Equations (5) and (6) readily
yield for $E\hat{X}$:⁹

(7)    $E\hat{X} = \sum_{j=1}^{N_t} \sum_{i=1}^{6} P_{ij} x_j$

where $P_{ij}$ is the probability that the $j$th individual falls in the
$i$th stratum, i.e.,

       $P_{ij} = \frac{6!}{i!\,(6-i)!}\, p_j^{\,i} (1 - p_j)^{6-i}$

is the $(i+1)$th term in the expansion of $(q_j + p_j)^6$. Since the sum
of the seven terms in this expansion equals unity, we have,

(8)    $\sum_{i=1}^{6} P_{ij} = 1 - q_j^6.$

Making this substitution in (7), the expected value of $\hat{X}$ becomes:

(9)    $E\hat{X} = \sum_{j=1}^{N_t} x_j (1 - q_j^6) = \sum_{j=1}^{N_t} x_j - \sum_{j=1}^{N_t} q_j^6 x_j.$

But $q_j^6$ is the probability that the $j$th individual will not be at
home on any of the six specified nights. The second term of (9),
therefore, is the expected value of the variate for the zero group,
thus,

(10)   $E\hat{X} = \sum_{j=1}^{N_t} x_j - \sum_{j=1}^{N_t} q_j^6 x_j = X_t - EX_0 = X.$
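The step from (5) and (6) to (8) and (9) rests on one identity: weighting by 6/i and averaging over the five previous nights gives each person an expected weight share of exactly one minus the chance of missing all six nights. The sketch below checks this numerically for a few values of p; the function names are hypothetical.

```python
from math import comb

def expected_weight_share(p):
    """6 * p * sum over i of (1/i) * C(5, i-1) * p**(i-1) * q**(6-i), as in (5)-(6)."""
    q = 1 - p
    return 6 * p * sum(comb(5, i - 1) * p ** (i - 1) * q ** (6 - i) / i
                       for i in range(1, 7))

def stratum_probabilities(p):
    """Binomial terms C(6, i) * p**i * q**(6-i) for i = 1..6, as defined after (7)."""
    q = 1 - p
    return [comb(6, i) * p ** i * q ** (6 - i) for i in range(1, 7)]

for p in (0.1, 0.5, 0.9):
    lhs = expected_weight_share(p)
    mid = sum(stratum_probabilities(p))
    rhs = 1 - (1 - p) ** 6
    print(p, round(lhs, 6), round(mid, 6), round(rhs, 6))
```

The three printed columns should agree for every p, which is the content of equation (8).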

It is clear that $\hat{X}$ is an unbiased estimate of $X$, the total
value of the variate for all persons in the population other than those
who cannot ordinarily be found at home in six visits. Although $X$ is a
constant, the $X_i$ will vary in successive samples according to the
number and identity of the individuals which fall in each stratum for a
particular arrangement of the interviewers’ schedule of visits. As a
corollary of the above proof, however, it can be shown that $\hat{X}_i$
is an unbiased estimate of the aggregate value of the variate for the
$i$th stratum, i.e., $E\hat{X}_i = EX_i$.

By definition, we have:

(11)   $X_i = \sum_{j=1}^{N_i} x_j.$

It follows directly that

(12)   $EX_i = \sum_{j=1}^{N_t} P_{ij} x_j.$

Since this expression is identical to that given within the summations
($i = 1, 2, \cdots, 6$) in equation (7), clearly $E\hat{X}_i = EX_i$.
It is also evident that $E\hat{N}_i = EN_i$ and that $E\hat{N} = N$.¹⁰

⁹ The assumption is implied that the six moments are independently selected at random. If
the particular questions suggested in Part I are used, the six moments in question are not
independently selected at random, but systematically selected within a randomly selected
cluster of six successive nights. An exact statement of probability would involve the
intra-class correlation of the probability of an individual being at home on successive
nights. However, experience has indicated that this correlation tends to be quite low or
even negative because a person is more apt to stay at home on a night following a night on
which he goes out. The assumption of a zero correlation is, therefore, realistic.

Brief attention must be given to the zero group which inevitably
contributes a bias to the results of any population survey. Any estimate
for this excluded group is necessarily based on an assumption, for
example, that $\bar{X}_0 = \bar{X}_1$, or where the variate under study
is highly correlated with the tendency to be away from home, the
assumption might be that $\bar{X}_0 = a + b\bar{p}_0$, in which (a) and
(b) are the constants in the regression between (x) and (p) as
determined from the sample. This latter assumption might be warranted
if the zero group is large and one is estimating, for example, the
number of nights each week individuals go to the movies, or listen to
the radio. The regression estimate has one clear advantage: the extent
of the bias because of “not-at-homes” depends upon the correlation
between (p) and (x). If they are uncorrelated, the bias vanishes. Where
the correlation in the population is linear, this estimate is unbiased.
It is not, however, susceptible to proof from the sample itself that
the relationship is linear within the zero interval even though
extremely strong evidence is afforded by the other six groups. The
problem of estimating for this group is no different than that found in
a callback operation. The size of the zero group and the consequent
bias will be reduced, of course, by discovering whether or not
respondents are at home on more than five previous nights, just as it
is reduced by making more than five callbacks.

¹⁰ A necessary condition for $\hat{N}$ and the $\hat{X}_i$ to be unbiased is the random
selection of a time for each visit. If the number and identity of the $N_i$ individuals in
each stratum were determined, for example, by an arbitrary arrangement of the interviewers’
schedule of visits, the $\hat{X}_i$ and $\hat{X}$ become biased estimates. Under these
conditions [equation illegible in the source], in which an indicator equals unity or zero
according to whether or not the $j$th person in the $i$th stratum is at home at a moment
taken arbitrarily.

If the exact probability $p_j$ were known for all the persons and the visits are made at a
random time the estimate $\sum_{j=1}^{n} x_j / p_j$ is an unbiased estimate for the entire
population including the zero group, for
$E\sum_{j=1}^{n} x_j / p_j = \sum_{j=1}^{N_t} p_j (x_j / p_j) = X_t$.

Variance of the Sample Estimate

Our next interest is in the variance of the sample estimate,
$\sigma^2_{\hat{X}}$, where the sample, according to the conditions of
the experiment, depends entirely upon the number and identity of the
persons at home when the interviewer calls. For the more usual case in
which the entire group of $N_t$ persons itself constitutes a sample
from a larger population, $\sigma^2_{\hat{X}}$ becomes a separate
contribution to the total variance. Suppose that $\hat{X}$ is used as an
estimate of $X'$, the aggregate value of the variate for a population
out of which the $N_t$ persons have been selected as a sample. The total
variance of this estimate is given by:

       $E(\hat{X} - X')^2 = E[(\hat{X} - X) + (X - X')]^2 = E(\hat{X} - X)^2 + E(X - X')^2 = \sigma^2_{\hat{X}} + \sigma^2_{X}$

where $\sigma^2_{X}$ is the contribution to the variance arising out of
any other sampling operations.

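Our main interest, therefore, is in the contribution to the variance
arising from the elimination of callbacks. By definition and by equation
(10) we have,

(13)   $\sigma^2_{\hat{X}} = E\hat{X}^2 - (E\hat{X})^2 = E\hat{X}^2 - X^2.$

From equation (4) the $E\hat{X}^2$ is given by,

(14)   $E\hat{X}^2 = E\Big(6\sum_{j=1}^{n} \frac{x_j}{i}\Big)^2 = 36E\sum_{j=1}^{n} \frac{x_j^2}{i^2} + 36E\sum_{j \ne k} \frac{x_j x_k}{i_j i_k}$

where $i_j$ and $i_k$ are the values of $i$ for the $j$th and $k$th
persons. The first term of (14) may be evaluated directly as in
equation (5).

(15)   $36E\sum_{j=1}^{n} \frac{x_j^2}{i^2} = 36\sum_{j=1}^{N_t} p_j E_j \frac{x_j^2}{i^2}$

in which, as before, $E_j$ denotes the conditional expectation knowing
that the $j$th individual is interviewed. Hence,

(16)   $E_j \frac{x_j^2}{i^2} = \sum_{i=1}^{6} \frac{x_j^2}{i^2} \cdot \frac{5!}{(i-1)!\,(6-i)!}\, p_j^{\,i-1} (1 - p_j)^{6-i}.$

Equations (15) and (16) readily yield

(17)   $36E\sum_{j=1}^{n} \frac{x_j^2}{i^2} = 6\sum_{j=1}^{N_t}\sum_{i=1}^{6} \frac{P_{ij}\, x_j^2}{i}$

where $P_{ij}$ is defined following equation (7) above.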

The second term of (14) is the sum of all possible combinations,
taken two at a time, of the weighted variate ($x_j$) for persons falling
in the sample. The expected value of this term, therefore, is the sum of
all such possible combinations for the entire population multiplied by
their respective probabilities of occurrence. First, we note that the
probability that the $j$th person is at home on any night in no way
affects the probability for the $k$th person, that is, $x_j$ and $x_k$
are independent and $Ex_jx_k = (Ex_j)(Ex_k)$. Thus, for the second term
of (14), we have,

(18)   $36E\sum_{j \ne k} \frac{x_j x_k}{i_j i_k} = \Big(6\sum_{j=1}^{N_t} p_j E_j \frac{x_j}{i}\Big)^2 - 36\sum_{j=1}^{N_t} \Big(p_j E_j \frac{x_j}{i}\Big)^2.$

The first term of (18) has been shown to be equal to $X^2$ (equations (5)
through (10), above). Similarly, it was shown that,

(19)   $6 p_j E_j \frac{x_j}{i} = x_j (1 - q_j^6).$

These substitutions for the second term of (14), together with the right
member of (17), for the first term of (14) produce the following
expression for $E\hat{X}^2$:

(20)   $E\hat{X}^2 = 6\sum_{j=1}^{N_t}\sum_{i=1}^{6} \frac{P_{ij}\, x_j^2}{i} + X^2 - \sum_{j=1}^{N_t} x_j^2 (1 - q_j^6)^2.$

Substituting in (13) and simplifying, we have,

(21)   $\sigma^2_{\hat{X}} = \sum_{j=1}^{N_t} x_j^2 \Big[6\sum_{i=1}^{6} \frac{P_{ij}}{i} - (1 - q_j^6)^2\Big].$





In almost all sampling problems of the type for which this technique is
appropriate, the variances in the population are unknown and it is
necessary to substitute values from the sample to compute the sampling
error of the sample estimate. In this case $\hat{p}_K = K/6$ is the
sample estimate of $p_j$ for all persons in the $K$th stratum and
$\hat{q}_K = 1 - \hat{p}_K$. The sample estimate of $\sigma^2_{\hat{X}}$
then becomes,

(22)   $s^2_{\hat{X}} = \sum_{K=1}^{6}\sum_{j=1}^{n_K} \frac{6}{K}\, x_j^2 \Big[6\sum_{i=1}^{6} \frac{\hat{P}_{iK}}{i} - (1 - \hat{q}_K^6)^2\Big]$

where $\hat{P}_{iK}$ is the sample estimate of $P_{ij}$ obtained by
substituting $\hat{p}_K$ and $\hat{q}_K$ for $p_j$ and $q_j$. Since there
are only six possible values of $\hat{p}_K$ and $\hat{q}_K$, as listed in
Table I, the corresponding six values of

       $A_K = \frac{6}{K}\Big[6\sum_{i=1}^{6} \frac{\hat{P}_{iK}}{i} - (1 - \hat{q}_K^6)^2\Big]$

have been worked out for use in estimating the variance of any sample
estimate based on this plan. They are shown in Column Three of Table
II. The sum of Column five is the sample estimate of the variance,
$s^2_{\hat{X}}$.

TABLE II

Stratum (K)     6/K      $A_K$     $S_K$      $A_K S_K$
    (1)         (2)       (3)       (4)          (5)

     1           6      16.150     $S_1$     16.150 $S_1$
     2           3       6.957     $S_2$      6.957 $S_2$
     3           2       2.802     $S_3$      2.802 $S_3$
     4          1.5      1.027     $S_4$      1.027 $S_4$
     5          1.2       .305     $S_5$       .305 $S_5$
     6           1        0        $S_6$         -

   Total                                     $s^2_{\hat{X}}$
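The coefficients in Column Three can be recomputed from the bracketed expression in (22) with p = K/6. The sketch below does so; reading the column-four quantity as the sum of squared values of the variate within each stratum is an assumption inferred from (22), and the function names are hypothetical. The printed values should come out close to the tabulated ones, with differences in the third decimal place.

```python
from math import comb

def variance_coefficient(k):
    """(6/K) * [6 * sum_i P_iK / i - (1 - q**6)**2] with p = K/6 and q = 1 - p."""
    p = k / 6
    q = 1 - p
    inner = sum(comb(6, i) * p ** i * q ** (6 - i) / i for i in range(1, 7))
    return (6 / k) * (6 * inner - (1 - q ** 6) ** 2)

def estimated_variance(sums_of_squares):
    """Sum over strata of coefficient * S_K, S_K being the sum of x**2 in stratum K."""
    return sum(variance_coefficient(k) * s for k, s in sums_of_squares.items())

for k in range(1, 7):
    print(k, round(variance_coefficient(k), 3))
```

Given the six stratum sums of squares from a survey, `estimated_variance` adds up Column five in one call.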





Numerical Examples 

To gain insight into the probable effects of practical applications of
this plan, it may be helpful to consider a hypothetical population and
the expected sample which would result. Such a population is shown in
Table III. The probability density function of the distribution of $p_j$
in this population is $3p_j^2\,dp_j$. This function was selected
primarily for convenience in computing the expected numbers at home
1, 2, ..., 6 nights out of the six in question. It does, however,
roughly approximate a typical distribution which might be inferred from
actual answers to the questions regarding nights at home.






TABLE III

HYPOTHETICAL DISTRIBUTION OF THE POPULATION AND EXPECTED SAMPLE BY PERCENT OF INTERVIEWING HOURS
PERSONS ARE AT HOME, AND THE EXPECTED NUMBER OF NIGHTS AT HOME OUT OF SIX SPECIFIED NIGHTS

[Table body not recoverable from the scan. Columns: per cent of interviewing hours at home, in
ten-point intervals from 0-10% upward; rows: expected number of nights at home out of the
specified six.]


The upper half of Table III shows the relation between the actual
percentage of time persons are at home and the expected numbers at
home 1, 2, ..., 6 nights out of six. The marginal distribution of $p_j$
for the entire population (first line of Table III) is obtained by
integrating successively

$$3N_t \int_a^b p_j^2 \, dp_j \qquad (0 \le a \le b \le 1)$$

where a and b are the percentages shown in the heading. The average
per cent of the time all persons were actually at home ($\bar{p}$) is given by:

(23)    $$\bar{p} = E\,p_j = 3\int_0^1 (p_j)\,p_j^2 \, dp_j = \tfrac{3}{4} = 75\%.$$


The product, $N_t$ multiplied by this integral between the limits (a) and
(b), will yield the expected number falling in the sample who were at home
between a and b per cent of the time. When (a) and (b) are taken
successively in intervals of 10%, the distribution of ($p_j$) in the sample is
obtained (line opposite “Expected Sample” in Table III). Similarly,
the expected numbers not in the sample are obtained from the integral

$$3N_t \int_a^b p_j^2 (1 - p_j) \, dp_j = 3N_t \left[ \frac{p_j^3}{3} - \frac{p_j^4}{4} \right]_a^b .$$

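The expected numbers, in the population, at home 1, 2, ..., 6 nights
out of six are given by:

(24)    $$E N_i = 3N_t \int_a^b \frac{6!}{i!\,(6-i)!}\, p_j^{\,i} (1 - p_j)^{6-i}\, p_j^2 \, dp_j = 3N_t \int_a^b P_{ij}\, p_j^2 \, dp_j ,$$

where $P_{ij}$ denotes the binomial term $\frac{6!}{i!\,(6-i)!}\, p_j^{\,i} (1 - p_j)^{6-i}$.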


The expected numbers, for the population, in the main body of the
table are obtained by using the successive limits (a) and (b) in intervals
of 10% (i = 1, 2, 3, ..., 6). The corresponding expected numbers in
the sample are obtained from

(25)    $$E n_i = 3N_t \int_a^b \frac{5!}{(i-1)!\,(6-i)!}\, (p_j)\, p_j^{\,i-1} (1 - p_j)^{6-i}\, p_j^2 \, dp_j = 3N_t\, \frac{i}{6} \int_a^b \frac{6!}{i!\,(6-i)!}\, p_j^{\,i} (1 - p_j)^{6-i}\, p_j^2 \, dp_j .$$




For any integral value of i, we have,

(26)    $$E n_i = \frac{i}{6}\, E N_i ,$$


Clearly this relation holds regardless of the limits of integration and 
the expected number in the sample in any “cell” is exactly equal to i/6 
multiplied by the corresponding expected number in the population. 
Therefore, by applying the weights 6/i to any group in the sample, we 
obtain an unbiased estimate of the corresponding group in the population.
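The relation between the expected sample and population counts can be checked numerically. The following is only a sketch under stated assumptions: it takes the density $3p^2$ of the hypothetical population over the whole range 0 to 1, assumes a population of 1,000,000 in which every person is called on once (as the figures quoted in the text suggest), and uses helper names (`beta_integral`, `expected_population`, `expected_sample`) that are introduced here for illustration and are not the authors'.

```python
from fractions import Fraction
from math import comb, factorial

N_T = 1_000_000  # assumed population size, every person visited once


def beta_integral(a, b):
    """Exact integral of p**a * (1 - p)**b over [0, 1] for integers a, b >= 0."""
    return Fraction(factorial(a) * factorial(b), factorial(a + b + 1))


def expected_population(i):
    """E N_i: expected persons at home exactly i of the six nights (density 3p^2)."""
    return 3 * N_T * comb(6, i) * beta_integral(i + 2, 6 - i)


def expected_sample(i):
    """E n_i: found at home on the call (probability p) and at home on
    i - 1 of the five preceding nights."""
    return 3 * N_T * comb(5, i - 1) * beta_integral(i + 2, 6 - i)


for i in range(1, 7):
    ratio = expected_sample(i) / expected_population(i)
    print(i, float(expected_population(i)), float(expected_sample(i)), ratio)

# Persons at home on none of the six nights can never enter the sample.
zero_group = expected_population(0)
weighted_total = sum(Fraction(6, i) * expected_sample(i) for i in range(1, 7))
print(float(zero_group), float(weighted_total))
```

Under these assumptions each ratio comes out as exactly i/6, and the weighted total falls short of the population by the zero group, which is the shortfall used in the check described in the next paragraph of the text.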

An interesting check, useful in practical applications of the plan, is 
afforded by comparing with n/Nt since each of these ratios is an 
estimate of the average per cent of time all persons in the population 
are at home. It is clear that n/Nt, the per cent of all individuals visited 
who were actually found at home, is an unbiased estimate, where calls 
have been scheduled at random. Because EU =ATthis comparison 
indicates the net error resulting from several sources including (1) bias
due to the zero group, (2) sampling error, (3) response bias in reporting
information on previous nights at home, etc. For the population in
Table III, the bias due to the zero group equals

$$\frac{750{,}000}{988{,}095} - \frac{750{,}000}{1{,}000{,}000} = .759 - .75 = .009 .$$


Several other interesting comparisons can be made from this “model”
population and expected sample. Consider the characteristic X which,
for example, is possessed by all persons in the population who are at
home more than 80% of the time and by no one else. The expected
value of the sample estimate is 487,996, while the population
value for the entire $N_t$ persons equals 488,000. On the other hand,
if the characteristic under study were possessed by all persons in the
population at home less than 20% of the time and by no one else, the
expected value of the sample estimate would be 4,884, while the population
value for the entire $N_t$ persons is X = 8,000. This latter substantial bias
(3,116 = 39%), of course, would also result from a sample in which five
callbacks were made. This comparison points up the fact that in extreme
cases where characteristics are possessed almost exclusively by persons
who go out frequently,


In many situations, it may be preferable to use the ratio estimate $X'' = X'(N_t/N')$ instead of $X'$,
where $X'$ and $N'$ denote the sample estimates of $X$ and $N_t$. Where the primary interest is in the
per cent ($X/N_t$), the estimate $(X'/N') = (X''/N_t)$ is often to be preferred. This estimate contains a
bias, which is usually not large, arising from the random variate appearing in the denominator. For the
two examples, the estimate $X''$ yields, respectively, 493,875 and 4,943.





extraordinary effort must be made to reduce the not-at-home bias.
In such a case, this plan may be coupled with, say, two callbacks to those
not at home on the first call, in which questions concerning previous
not-at-home performance are asked, as in the original calls. Application
of a similar projection procedure to these respondents will reduce the
bias to about 30%. From the standpoint of sampling bias alone, therefore,
in an extreme case this technique together with two callbacks will
reduce the bias of 39% obtained from five callbacks to a bias of 30%,
after eliminating three expensive callbacks.
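The two extreme examples can be reproduced approximately from the density $3p^2$ alone. The sketch below assumes a population of 1,000,000 visited once each, and treats the expected sample estimate as the population count in the class less those members at home on none of the six nights. The function names are illustrative only, and small differences from the printed figures are to be expected because the printed values were presumably rounded by cell.

```python
from fractions import Fraction
from math import comb

N_T = 1_000_000  # assumed population size


def population_in_class(a, b):
    """Persons at home between a and b of the time: 3 N_t * integral of p^2."""
    return N_T * (b**3 - a**3)


def zero_group_in_class(a, b):
    """Persons in the class at home on none of six nights:
    3 N_t * integral of p^2 (1 - p)^6, expanded by the binomial theorem."""
    def antiderivative(p):
        return sum(comb(6, k) * (-1) ** k * p ** (k + 3) / (k + 3) for k in range(7))
    return 3 * N_T * (antiderivative(b) - antiderivative(a))


def expected_estimate(a, b):
    """Expected value of the weighted sample estimate for the class."""
    return population_in_class(a, b) - zero_group_in_class(a, b)


estimated_total = expected_estimate(Fraction(0), Fraction(1))  # about 988,095

for a, b in [(Fraction(4, 5), Fraction(1)), (Fraction(0), Fraction(1, 5))]:
    true_value = population_in_class(a, b)
    estimate = expected_estimate(a, b)
    ratio_estimate = estimate * N_T / estimated_total
    bias_pct = 100 * (true_value - estimate) / true_value
    print(float(true_value), float(estimate), float(ratio_estimate), float(bias_pct))
```

The second class shows a bias of roughly 39 per cent, in line with the figure quoted above, while the first is almost unbiased.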

Obviously, the considerable advantage afforded by this plan in
reducing bias must be modified to some extent by consideration of the
contribution to sample error because of the failure to include all N
cases. For the two examples just cited, this sampling error as estimated
from the sample cases is .00059 and .0225 (coefficient of variation), for
the cases, respectively, in which the characteristic is possessed by
persons at home more than 80% of the time and less than 20% of the
time.

In contemplating any particular survey operation, it is necessary to
compare the possible benefits which might result from the use of the
nights-at-home questions with those of alternative plans, as for example,
provision for callbacks to only a sub-sample of the persons not at home.
The sub-sample, of course, also contributes to the sampling error.
Unfortunately, there are few safe generalizations regarding which plan
will produce the most information for the cost. In a real sense, the two
plans are equivalent, for the nights-at-home questions result in
sub-samples of individuals who are usually at home about one night in six,
two nights in six, and so on. Moreover, the number of sample cases
obtained for each of these “sub-samples” tends to correspond roughly
to the numbers that would have been obtained from sub-samples of
callbacks that were allocated according to optimum conditions
considering the costs. That is to say, second call interviews cost more than
first call interviews; and third, fourth, fifth and sixth call interviews
become progressively more and more expensive so that the optimum
allocation formula leads to smaller and smaller sub-samples of
callbacks for each of these groups, respectively. This is because respondents
found on callbacks tend to be more widely scattered, requiring extra
travel time and expense.

Since the addition to the variance resulting from the use of the
nights-at-home question is only one contribution to the over-all
variance of the estimate, its importance depends upon the efficiency of the
original sample. If the original sample is both large and highly efficient
it may pay to obtain additional interviews from persons frequently
away from home even at a high cost per interview. Experience indicates,
however, that population samples are seldom efficient enough to
warrant the extra cost. This naturally depends also upon the particular
cost structure of the organization conducting the survey.

Because of the effects of clustering respondents to save travel costs,
efficiently conducted population surveys are likely to have more than
twice the sampling error of a widely scattered unrestricted random
sample of the same size. By making this assumption, it is possible to
compare the results of this plan for eliminating callbacks, Plan A,
with those of an operation providing for initial calls to 10,000 persons
and up to five callbacks to find the not-at-homes, Plan B. The two
samples are to be drawn from the population described in Table III.
The characteristic under study is possessed by about half of the
population and is unrelated to the tendency to be at home. Under these
conditions the most probable results of these two operations are shown
in Table IV.

The total numbers of home visits under the two plans are equal. Inasmuch
as all visits under Plan A are first calls, while about 4,500 visits under
Plan B are callbacks, Plan B must require more expensive field work.
Since the sampling error for Plan B is larger than for Plan A, it is
apparent that Plan A yields more information for the costs, under the
stated assumptions of this example. If the original sample were twice as
efficient as a random sample, the sampling error for Plan A would be
almost exactly equal to that of Plan B, one-fourth of one per cent. Thus,
Plan A would still yield more information for the cost. The perfect
correlation, $\rho_{XN} = 1$, implicit in the assumption that the characteristic
under study is not related to the frequency with which persons are
away from home, tends to reduce the sampling error of Plan A
somewhat unrealistically. It is easy to construct hypothetical examples
based on other assumptions to compare the two plans and possibly to
appraise the effects of sub-sampling the original not-at-homes.
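The Plan B side of this comparison can be sketched from the hypothetical density $3p^2$. The fraction of persons still not found after k calls is $3\int_0^1 p^2(1-p)^k\,dp = 6/((k+1)(k+2)(k+3))$, and the sampling error below applies the stated assumption that a clustered sample has twice the error of an unrestricted random sample. The names and the exact factor of 2 are assumptions made for illustration, not the authors' computation.

```python
from math import sqrt

FIRST_CALLS = 10_000   # initial calls under Plan B, as stated in the text
DESIGN_FACTOR = 2.0    # assumed: twice the error of a simple random sample
P = 0.5                # characteristic possessed by about half the population


def not_at_home_after(calls):
    """Expected persons still not interviewed after `calls` visits each."""
    k = calls
    return FIRST_CALLS * 6 / ((k + 1) * (k + 2) * (k + 3))


# Visits on call 1 go to everyone; visits on call k + 1 go to those missed k times.
visits_per_call = [FIRST_CALLS] + [not_at_home_after(k) for k in range(1, 6)]
total_visits = sum(visits_per_call)
interviews = FIRST_CALLS - not_at_home_after(6)
callbacks = total_visits - FIRST_CALLS

sampling_error_b = DESIGN_FACTOR * sqrt(P * (1 - P) / interviews)

print([round(v) for v in visits_per_call])
print(round(total_visits), round(callbacks), round(interviews))
print(f"Plan B sampling error: {100 * sampling_error_b:.3f}%")
```

Under these assumptions the visit and interview counts agree with the Plan B figures in Table IV, and the callbacks number about 4,500 as the text states.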

Some allowance should be made for the fact that interviewers are
frequently able to find out from other members of the family when the
absent member will be at home, and by scheduling callbacks according
to this information a higher percentage of persons may be found on
subsequent visits. The primary difficulty here, however, is that
callbacks necessitate travel to a neighborhood and it is seldom possible
to schedule the visits to coincide with the various times that several
persons to be interviewed in the neighborhood are at home. In addition,
no one will be found at home in many of the households visited and it is
not easy to obtain reliable information from neighbors. Nevertheless,



TABLE IV

COMPARISON OF THE PROBABLE RESULTS OF A CALLBACK OPERATION WITH
THOSE OF A PLAN FOR ELIMINATING CALLBACKS

Plan A: Nights at Home Questions

No. of nights    Number                   Number with
at home out      of          Inter-       character-
of six           visits      views        istic (x)
    (1)           (2)         (3)           (4)           (5)

    All         14,464      10,849         5,425         7,146
     6           4,822       4,822         2,411         2,411
     5           3,616       3,014         1,507         1,808
     4           2,583       1,722           861         1,292
     3           1,722         861           431           861
     2           1,032         344           172           516
     1             517          86            43           258
     0             172           -             -             -

Plan B: Callback Operation

                 Number
                 of          Not-at-      Inter-
                 visits      homes        views
                  (6)         (7)          (8)

All calls       14,464       4,583        9,881
1st Call        10,000       2,500        7,500
2nd Call         2,500       1,000        1,500
3rd Call         1,000         500          500
4th Call           500         286          214
5th Call           286         178          108
6th Call           178         119           59

S 7,146 


\ A ADtA 1 nn 


=50% 


?i»=60% 


^£2 -- 4,443 = 8,886 pxif = 1.00 

2 _ 5"* (<rS^ . (tn^ 2pxN(rxo‘if 




(.60) (.60) 
881 


= 1.006% 


I- 

* r(.60)(.60)-l 


2 * 1^ w 

(.60)(.60)n 


}- 


00000186 


00000186 


+.00006997^.00007183 


cr^^^.85%. 


to the extent that such information can be found and utilized, the
callback operations become more effective.

Other considerations pertinent to a decision regarding the possible
use of the nights-at-home questions may be mentioned briefly, as
follows: (a) the efficiency of the original sample, (b) the number of
callbacks permitted by the budget, (c) the time available in which to
produce the final survey results, (d) the relationship between the variate
under study and the tendency to be away from home, (e) the time of
day during which interviews are to be conducted, (f) the particular
population group under study, e.g., farmers, housewives, car owners,
men over 21 years old, and so on. It should prove to be especially
helpful in the further development of this plan to have reports from others
concerning their experience with it, as well as any suggestions for
modifications or improvements in the particular techniques described above.









APPLICATION OF LEAST SQUARES REGRESSION 
TO RELATIONSHIPS CONTAINING AUTO- 
CORRELATED ERROR TERMS* 

D. COCHRANE AND G. H. ORCUTT
Department of Applied Economics, Cambridge 

We point out that autocorrelated error terms require
modification of the usual methods of estimation and prediction;
and we present evidence showing that the error terms involved
in most current formulations of economic relations are highly
positively autocorrelated. In doing this we demonstrate that
when estimates of autoregressive properties of error terms are
based on calculated residuals there is a large bias towards
randomness. We demonstrate how much efficiency may be lost
by current methods of estimation and prediction; and we give
a tentative method of procedure for regaining the lost
efficiency.

INTRODUCTION

Three major complications may be distinguished in the statistical
measurement of relationships between economic time series:

1. The existence of simultaneous relationships between the
variables.

2. The presence of auto-correlated error terms. This has been less
exactly called the time-series complication.

3. The presence of errors of observation in each of the variables.

The first complication was forcefully brought to the attention of
economists by Frisch [1] and Haavelmo [2]; and much work has since
been done by Koopmans [3] and others [4] in finding the structural
parameters when the economic variables are described by a system of
simultaneous equations. This approach is very promising but the
time-series complication has been assumed away by the specification that
the error terms which enter into each equation are independent in
successive periods of time.

A considerable amount of work has also been devoted to problems 
relating to the second complication. The rather extensive literature 
connected with the variate difference method, conveniently summarized 
by Tintner [5], and also the general analysis of economic time trends 

* We wish to express our thanks for the considerable assistance we have received from Richard
Stone.




may be included under this heading. More directly related to the
problem are the studies which examine the distribution of correlations
between autocorrelated series [6], the major proportion of which are
devoted to tests for the null hypothesis. Of those papers which are
concerned with the measurement of functional relationships between
series, few make it clear that the significant factor in the analysis is the
autocorrelation of the error term and not the autocorrelation of the
time series themselves. This fact has been well expressed by Aitken [7],
but its importance seems to have escaped the attention of economists.
We should also refer to a paper by Champernowne [8] which became
available after this study was essentially complete. Champernowne’s
paper recovers much of the ground developed by Aitken and is an
exceedingly useful study, carrying the problem into the field of statistical
estimation and sampling theory.

The third complication arises when the assumption that the
explanatory variables are measured free from error cannot be maintained, and
may therefore be a problem of some importance when considering
economic data. In the absence of a complete knowledge of the correlation
matrix of the errors, simplifying assumptions that the errors in each of
the variables are random and uncorrelated both with the systematic
part of each variable and with the errors in the other variables must be
made. The problems involved have received consideration in the work
of Frisch [9], Koopmans [10], Tintner [11], Reiersøl [12] and Geary
[13].

The objects of this paper are four-fold. First, we wish to focus the
attention of economists on the fact that the presence of autocorrelated
error terms requires some modification of the usual least squares
method of estimation; and secondly, we wish to show that there is
strong evidence in favour of the view that the error terms involved in
most current formulations of economic relations are highly positively
autocorrelated. In doing this we demonstrate the presence of a large
bias towards randomness in estimates of the autoregressive properties
of error terms which are based on calculated residuals. Third, we
indicate roughly how much efficiency is lost by current methods of
estimation and prediction if error terms are highly autocorrelated; and finally,
we present a tentative method of procedure.

In arriving at our conclusions we have placed considerable reliance
on results obtained from a number of sampling experiments. We
recognize that results arrived at by this procedure may not have the elegance
or all of the utility of results obtained deductively from the same
assumptions; nevertheless this method of approach is a legitimate one
and frequently makes it possible to obtain useful answers to problems
which have proved stubborn to mathematical statisticians. In this
connection it might be noticed that there are a large number of
important questions in the field of statistics which in principle could be
answered deductively but which have till the present time proved too
difficult. Most of these questions could be answered by sampling
experiments and it is to be hoped that, as improved calculating equipment
becomes available, more attention will be given to this approach.

In order to concentrate on the problem of auto-correlated errors, we
have ignored the difficulties arising from the simultaneous equations
complication and the errors in the variables complication. However,
it should be obvious that for the purpose of estimating structural
parameters it is necessary to find a method of dealing simultaneously with
all three complications, or at least some indication of their relative
importances. A consideration of some aspects of the difficulties to be
experienced in analysing relationships when more than one of these
complications are present is contained in a following paper [14].


REGRESSION ANALYSIS WITH AUTOCORRELATED ERROR TERMS
OF KNOWN AUTOREGRESSIVE PROPERTIES

It may be helpful to restate briefly the assumptions underlying the
method of least squares. Suppose a single linear relationship exists
between the variables $x_{1t}, x_{2t}, \ldots, x_{kt}$ of the form

(2.1)    $$x_{1t} = a + \sum_{i=2}^{k} b_i x_{it} + u_t$$

where $u_t$ is a random error term with constant variance, while the a
and the b's are constants to be determined. Provided the $x_{2t}, \ldots, x_{kt}$
are independent of the random error term $u_t$, then the best linear
unbiassed estimates of these coefficients are given by the method of least
squares, best estimates meaning those estimates which have a minimum
variance. This is true even if the independent variables are
autocorrelated, provided we can consider them as fixed in repeated samples [15].
If in addition the error term is normally distributed then the least
squares estimates are maximum likelihood estimates [16].

In many economic relationships it is an oversimplification to assume
that error terms are independent in time. If we have a relationship in
which the error term is autocorrelated, it has been shown by Aitken [17]
that the method of least squares still yields the best linear unbiassed
estimates of the regression coefficients provided the lack of
independence in the error series is taken into account. One method of overcoming
this lack of independence is to make the error term random by
transforming all the variables according to the autoregressive structure of
the error term. Suppose we have a linear relationship given by

(2.2)    $$y_t = a_0 + a_1 x_t + u_t$$

where $u_t$ is generated by the Markoff scheme

(2.3)    $$u_t = \beta u_{t-1} + \epsilon_t$$

with random disturbances $\epsilon_t$ and a known autoregression coefficient $\beta$.
We may substitute for $u_t$ in equation (2.2) and obtain

(2.4)    $$y_t' = a_0' + a_1 x_t' + \epsilon_t$$

where $a_0' = a_0(1 - \beta)$,

(2.5)    $$y_t' = y_t - \beta y_{t-1}$$ and

(2.6)    $$x_t' = x_t - \beta x_{t-1}$$

and the application of least squares to equation (2.4) will produce best
linear unbiassed estimates of the regression coefficients $a_0$ and $a_1$.^1
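The transformation in (2.4) to (2.6) can be sketched in a few lines for the two-variable case. This is an illustration under stated assumptions rather than the authors' own computing procedure: $\beta$ is taken as known, the first observation is dropped when differencing, and the helper names and simulated parameter values are hypothetical.

```python
import random


def ols(x, y):
    """Simple least squares of y on x; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def quasi_difference(series, beta):
    """Transform z_t into z_t - beta * z_{t-1}, as in (2.5) and (2.6)."""
    return [series[t] - beta * series[t - 1] for t in range(1, len(series))]


def transformed_least_squares(x, y, beta):
    """Least squares on the transformed relation (2.4) with known beta.

    Returns (a0, a1); a0 is recovered from a0' = a0 * (1 - beta),
    so beta must differ from 1.
    """
    a0_prime, a1 = ols(quasi_difference(x, beta), quasi_difference(y, beta))
    return a0_prime / (1 - beta), a1


# Illustrative simulation; every parameter value here is made up.
random.seed(1)
A0, A1, BETA = 2.0, 0.5, 0.8
x, y, level, u = [], [], 0.0, 0.0
for _ in range(200):
    level += random.gauss(0, 1)          # an autocorrelated explanatory series
    u = BETA * u + random.gauss(0, 1)    # Markoff error term as in (2.3)
    x.append(level)
    y.append(A0 + A1 * level + u)

print("ordinary least squares:   ", ols(x, y))
print("transformed least squares:", transformed_least_squares(x, y, BETA))
```

A single run only illustrates the mechanics; comparing the efficiency of the two estimators would require repeating the simulation many times, in the spirit of the sampling experiments the authors describe.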

It is also possible to improve on the ordinary methods of prediction
when the error terms are autocorrelated. If we wish to estimate $y_t$ from
a given $x_t$ it can be seen that equation (2.2) is not the most efficient
form in which to make this estimation. A more appropriate form
would be to use the relation

(2.7)    $$y_t = a_0' + a_1(x_t - \beta x_{t-1}) + \beta y_{t-1}$$

where $a_0'$ and $a_1$ are estimated from (2.4). In a later section we shall
illustrate the gain to be achieved by using this relation in problems of
estimation.
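A small numerical case may make the advantage of (2.7) concrete. The sketch below assumes the coefficients are known exactly and that the new disturbance is zero, so that the only difference between the two predictors is whether the previous period's error is carried forward. The function names and all figures are hypothetical.

```python
def predict_plain(a0, a1, x_t):
    """Prediction from (2.2), ignoring the autocorrelated error."""
    return a0 + a1 * x_t


def predict_with_lag(a0_prime, a1, beta, x_t, x_prev, y_prev):
    """Prediction from (2.7), which uses last period's x and y."""
    return a0_prime + a1 * (x_t - beta * x_prev) + beta * y_prev


# Hypothetical values chosen only for illustration.
a0, a1, beta = 2.0, 0.5, 0.8
x_prev, x_t = 10.0, 11.0
u_prev = 1.5                              # last period's error
y_prev = a0 + a1 * x_prev + u_prev        # observed y in the previous period
y_t = a0 + a1 * x_t + beta * u_prev       # actual y when the new disturbance is zero

plain = predict_plain(a0, a1, x_t)
lagged = predict_with_lag(a0 * (1 - beta), a1, beta, x_t, x_prev, y_prev)
print("actual:", y_t, "from (2.2):", plain, "from (2.7):", lagged)
```

In this constructed case the prediction from (2.2) misses by $\beta u_{t-1}$, whereas (2.7) recovers the actual value; with a non-zero disturbance the remaining error of (2.7) would be that disturbance alone.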

In the discussion which follows it is convenient to restrict the
meaning of error term to the true series of errors in a relationship, that is, the
series of errors which would be obtained if the true values of the
regression parameters were applied in the relationship. To distinguish
the discrepancies actually obtained from the true errors we shall call
them residuals. In addition, we shall limit the word disturbance to
describe the random elements in an autoregressive equation.


^1 A more complete statement of this solution is to be found in Section VI.






AUTOCORRELATION OF ERROR TERMS AND RESIDUALS OF ECONOMIC
AND CONSTRUCTED RELATIONSHIPS

In this section we develop the argument that the error terms in
many if not most current formulations of economic relations are highly
positively autocorrelated, but it should be stressed that we are not
trying to prove that this must be so in every case or that it is impossible
to formulate relations in which the error terms are random. Since the
autocorrelation properties of economic time series will frequently arise
in this section, we should first like to refer to a study by Orcutt [18]
in which it is shown that the fifty-two series used in Tinbergen’s [19]
model of the economic system of the United States might be considered
to have been obtained by drawings from a single population of linear
stochastic series having the same underlying autoregressive structure.
The underlying autoregressive equation was estimated to be of the form

(3.1)    $$x_{t+1} = x_t + 0.3(x_t - x_{t-1}) + \epsilon_{t+1}$$

where the $\epsilon$'s are random disturbances. The high positive
autocorrelation of economic time series which (3.1) implies is a feature which should
not be overlooked.

Turning to the error terms, let us investigate their sources and see if
there is reason to believe that the error terms also are likely to be highly
positively autocorrelated. We can examine their sources under three
main headings.

(1.) Systematic errors may arise from a faulty choice of the form of
relationship assumed to exist between economic variables. Since the
economic variables are positively autocorrelated, then in general errors
of this type will be positively autocorrelated. Further, the shortness of
most available time series makes the statistical results meaningless if
very complicated relationships are adopted, so that errors of this type
are inevitable.

(2.) Error terms may arise owing to the omission of variables, both
economic and non-economic, from the analysis. Important variables
may be omitted either because they are not available or because their
importance is not realized. Furthermore, because of the brevity of
available time series, it is also frequently necessary to neglect variables
which individually have but a small influence. Nevertheless, it is
evident that the total influence of a number of such variables may be
very substantial and highly positively autocorrelated.^2 Now, as
already indicated, there is strong evidence in favour of believing that
most economic time series are highly positively autocorrelated.
Therefore, in so far as the omitted variables are economic time series, we may
expect the resulting error terms to be highly positively autocorrelated.

Consider also the case of non-economic variables which are likely to
influence economic behaviour but which are generally omitted. Some
of those that more readily come to mind are population and its age,
sex and spatial distribution, changes in cultural patterns, technological
developments, exploitation and exhaustion of mineral resources
including changes in soil fertility, and climatic conditions. Most of the above
series have very high positive autocorrelations but even where the
autocorrelations are not high, as in the case of at least certain climatic
conditions, it is evident that their impact on the economic system is still
likely to be autocorrelated. Thus even if rainfall was really a random
series, the water level in the soil, being the result of rainfall over several
years, would be positively autocorrelated. We might recall in this
respect the correlograms given by Wold [20] of the average yearly
rainfall during the period 1867 to 1936 of four cities in or near the drainage
basin of Lake Väner and the average annual water level (obtained from
quarterly observations) of Lake Väner from 1867 to 1936. The
correlogram of the yearly rainfall indicated a random series while that of the
level of the lake indicated a positively autocorrelated series, showing
that, whilst the occurrence of certain meteorological factors may be
random, their general influence over time may be systematic.
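The rainfall argument can be illustrated with a small simulation. The sketch below does not use Wold's data: it draws an artificial random "rainfall" series and forms a "water level" as the sum of the last few years' rainfall. The four-year window, the series length and the function name are arbitrary assumptions made for illustration.

```python
import random


def lag1_autocorrelation(x):
    """First-order serial correlation coefficient of a series."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in x)
    return num / den


random.seed(0)
YEARS_ACCUMULATED = 4  # hypothetical number of years of rain reflected in the level

# Artificial yearly rainfall: independent random draws.
rain = [random.gauss(0, 1) for _ in range(2000)]

# Artificial water level: rainfall accumulated over the last few years.
level = [
    sum(rain[t - YEARS_ACCUMULATED + 1 : t + 1])
    for t in range(YEARS_ACCUMULATED - 1, len(rain))
]

r_rain = lag1_autocorrelation(rain)
r_level = lag1_autocorrelation(level)
print(f"rainfall: {r_rain:.3f}   water level: {r_level:.3f}")
```

For a sum of m independent terms with equal variance, the first autocorrelation of successive overlapping sums is (m - 1)/m, so the simulated level should show a coefficient near 0.75 while the rainfall itself stays near zero.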

Now it may be reasonably argued that the economic behaviour of
individuals is not completely dependent on economic variables or
non-economic variables of the type we have mentioned, and that, even
if an explanation incorporated in the correct manner as many as
necessary of these variables, it would still not yield perfectly correct
predictions.^3 No doubt this is true, and the explanatory variables needed


^2 This may be shown as follows. If we have two unrelated autocorrelated series $x_t$ and $y_t$ whose first
autocorrelations are given by

$$\frac{\operatorname{cov}(x_t, x_{t-1})}{\operatorname{var}(x)} \quad \text{and} \quad \frac{\operatorname{cov}(y_t, y_{t-1})}{\operatorname{var}(y)}$$

then if $z_t = x_t + y_t$ the first autocorrelation of $z_t$ is given by

$$\frac{\operatorname{cov}(x_t, x_{t-1}) + \operatorname{cov}(y_t, y_{t-1})}{\operatorname{var}(x) + \operatorname{var}(y)} .$$

This result may be generalised to show that the sum of any number of autocorrelated series is also
autocorrelated with its first autocorrelation equal to the sum of the first lag covariances of the individual
series divided by the sum of the individual variances.

^3 See for example T. Haavelmo, “The Probability Approach in Econometrics,” op. cit., Section 11.




to complete the explanation may be of an approximately random
character since they relate to such things as the physiological processes of
each individual. However, it would be a mistake to infer from this that
economic time series contain a significant random component, for what
will obviously happen when the behaviour of a large number of
individuals is averaged is that those actions of individuals which are
positively correlated with the actions of others will dominate the average
while those actions which are random for each individual and
uncorrelated as between individuals will be averaged out.

(3.) The series of data used may not measure exactly what is
required for the particular analysis. In so far as the discrepancy is one of
coverage, it seems reasonable to believe that the error term involved
will have much the same autoregressive properties as economic series
in general. In so far as the discrepancy is more nearly what might
perhaps be called a pure error of observation, it would appear more
difficult to say anything about whether or not it is autocorrelated.
However, on the basis of discussions with economists engaged in the
construction of basic economic data, we have formed a very strong
impression that, if an error is committed one year, it will very likely be
committed again the next year and that most errors of observation are
positively autocorrelated.

Let us now see whether our theory is plausible by making a brief
examination of the autocorrelations of the residuals obtained in several
econometric studies. These are two papers by Lawrence R. Klein,
“The Use of Econometric Models as a Guide to Economic Policy” [21]
and “Economic Fluctuations in the United States 1921-1941” [22];
a paper by M. A. Girshick and Trygve Haavelmo [23]; and a paper
by Richard Stone [24]. The measure of autocorrelation used is the ratio
of the mean square successive difference to the variance of the residuals.
This ratio is generally denoted by $\delta^2/s^2$ [25], where $\delta^2$ and $s^2$ are defined
by

(3.2)    $$\delta^2 = \frac{1}{N-1} \sum_{i=1}^{N-1} (x_{i+1} - x_i)^2 ,$$

(3.3)    $$s^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^2 ,$$

where

$$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i .$$
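The ratio defined in (3.2) and (3.3) is simple to compute. The sketch below is an illustration with made-up series, not a recomputation of any of the residuals discussed in the text; the function names are hypothetical, and the expectation 2N/(N - 1) for a random series is the one quoted further on in the text.

```python
def von_neumann_ratio(x):
    """Ratio of the mean square successive difference to the variance,
    delta^2 / s^2, with divisors N - 1 and N as in (3.2) and (3.3)."""
    n = len(x)
    mean = sum(x) / n
    delta2 = sum((x[i + 1] - x[i]) ** 2 for i in range(n - 1)) / (n - 1)
    s2 = sum((v - mean) ** 2 for v in x) / n
    return delta2 / s2


def expected_ratio_random(n):
    """Expected value of delta^2 / s^2 for a random series of n items."""
    return 2 * n / (n - 1)


# Made-up series for illustration only.
smooth = [1.0, 2.0, 3.0, 4.0]           # steadily trending: low ratio
alternating = [1.0, -1.0, 1.0, -1.0]    # sign changes every period: high ratio

print(von_neumann_ratio(smooth))
print(von_neumann_ratio(alternating))
print(round(expected_ratio_random(20), 2))
```

Values well below the random expectation point to positive autocorrelation, and values well above it to negative autocorrelation.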


This ratio has been calculated by Klein for the residuals in his two
papers and we have computed the ratios for the residuals in the other
two papers [26]. Two ratios in each of Klein’s papers have been omitted
as they refer to first differences of the economic series and are not
comparable for our purposes. It should be mentioned that the residuals
given in Klein’s paper in Econometrica and the residuals given by
Girshick and Haavelmo were calculated by the reduced-form method, which
presupposes that it is possible to solve for each of a number of jointly
dependent variables in terms of exogenous variables and random error
terms, and these random error terms are simply linear combinations of
the error terms given in the original system of equations [27]. The
residuals obtained from Klein’s mimeographed paper and from Stone’s
paper were calculated by the ordinary least squares method of regression.
The total number of series considered is 43 and Table I shows them
classified according to source and number of parameters used in each
equation. The individual values of $\delta^2/s^2$ are illustrated on the scatter
diagrams of Figures I-IV.

TABLE I

SUMMARY OF VALUES OF δ²/s² OBTAINED FOR VARIOUS RESIDUALS

                               Number      Number of parameters
Source of residuals            of years     3     4     5     6     Total

Klein, Econometrica               22        2     7     2     1       12
Klein, Mimeographed study         20        1     7     1     -        9
Girshick and Haavelmo             20        2     2     1     -        5
Stone                             19        4     6     6     1       17

Total                                       9    22    10     2       43

δ²/s² < 1.24 (P = 0.025)                    7     6     3     -       16
δ²/s² < 1.37 (P = 0.05)                     8    10     4     -       22


[Figures I-IV: scatter diagrams of the values of $\delta^2/s^2$ for residuals of
relationships estimated by various statisticians]

The probability distribution of $\delta^2/s^2$ for a random series has been
tabulated [28] for various N, where N is the number of items. This
distribution is symmetrical around 2N/(N-1) so that for N=20 the
expected value of $\delta^2/s^2$ for a random series is 2.11. This is the horizontal
dotted line shown on the diagrams. In view of the high positive
autocorrelation of economic time series and the reasons given for expecting
error terms to be autocorrelated, there seems little chance of obtaining
a value of $\delta^2/s^2$ around the upper tail of the distribution and, since we
wish to minimize the risk of failing to reject a value of $\delta^2/s^2$ as coming
from a random population, the appropriate test would seem to involve
the use of the value of $\delta^2/s^2$ corresponding to the 5 per cent significance
level, from the lower tail only. Since all our series are of approximately
the same length, this value is 1.37 for N=20. The value of $\delta^2/s^2$ (=1.24)
corresponding to the 5 per cent significance level which includes both
tails has also been added.⁴ Out of the 43 series, 16 are significantly
different from a random series at the 2½ per cent level, while 22 are
significant at the 5 per cent level. These results indicate that in many
cases the assumption of random error terms is not a very good
approximation to the truth.
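
The lower-tail test just described can be sketched as follows. The critical values 1.37 and 1.24 are the ones quoted in the text for N=20; the function and constant names are our own, and the sketch assumes series of about twenty items, for which those values apply.

```python
# Critical values of delta^2/s^2 quoted in the text for N = 20.
LOWER_TAIL_5_PER_CENT = 1.37    # 5 per cent level, lower tail only
LOWER_TAIL_2_5_PER_CENT = 1.24  # 2.5 per cent level, lower tail
                                # (5 per cent level including both tails)


def expected_ratio(n):
    """Expected value of delta^2/s^2 for a random series of n items."""
    return 2 * n / (n - 1)


def significantly_positive(ratio, critical=LOWER_TAIL_5_PER_CENT):
    """True when the ratio falls below the lower-tail critical value,
    i.e. when randomness is rejected in favour of positive
    autocorrelation."""
    return ratio < critical


if __name__ == "__main__":
    print(round(expected_ratio(20), 2))
    print(significantly_positive(0.69))
    print(significantly_positive(2.00))
```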

The sloping lines on Figures I-IV correspond to the average of
twenty estimates of $\delta^2/s^2$ obtained from constructed relationships,
described in subsequent paragraphs, in which the error terms were first
summations of random series. It would seem more reasonable to
consider that the values of $\delta^2/s^2$ are distributed around a line of this nature
rather than around the horizontal random line. This suggestion is
supported by the decreasing proportion of residuals which are significantly
different from random series as the number of parameters in the
relationship increases. From Table I it can be seen that the proportions
which are significantly different from random are 8/9, 10/22, 4/10 and
0/2 for 3, 4, 5 and 6 parameters respectively.

Construction of an experimental model. The examination of the
residuals obtained from actual economic relationships fails to reject the
hypothesis that error terms are highly positively autocorrelated in a
number of economic relationships. Little is known about the behaviour
of relationships possessing autocorrelated error terms, so it was
decided to construct several relationships of this type from artificial series
and observe the results of applying least squares regression. The general
form of the relationship adopted was:

(3.4)    $X_1 = k + b_{12.3t} X_2 + b_{13.2t} X_3 + b_{1t.23}\, t + u$

where $X_2$, $X_3$ and $u$ were independently constructed series all possessing
the same autoregressive structure, $t$ represented a linear time trend and
the true values of the constants were $k=0$, $b_{12.3t}=2$, $b_{13.2t}=1$ and
$b_{1t.23}=0$. Thus the actual equation used for the construction was:

(3.5)    $X_1 = 2X_2 + X_3 + u.$

Five sets of relationships of this form were constructed with different
autoregressive structures, each set containing 20 equations. The series
used were generated according to the following formulae:

4 Klein has taken the 5 per cent level of significance to include both tails of the distribution
(Econometrica, op. cit., p. 114).

         A.  $x_{t+1} = x_t + 0.3(x_t - x_{t-1}) + \epsilon_{t+1}$

         B.  $x_{t+1} = x_t + \epsilon_{t+1}$

(3.6)    C.  $x_{t+1} = 0.3x_t + \epsilon_{t+1}$

         D.  $x_{t+1} = \epsilon_{t+1}$

         E.  $x_{t+1} = \epsilon_{t+1} - \epsilon_t$

where the $\epsilon$'s denote series of random disturbances. Instead of stating
the precise form of the autoregressive equation each time a series is
referred to we shall use the letters A, B, C, D and E as a convenient
notation.
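
The five generating schemes in (3.6) can be sketched in Python as follows. This is an illustration only: the function names are ours, a pseudo-random generator stands in for the printed tables of random sampling numbers, and the starting values (series C started from zero, series A and B given a zero first term) are assumptions made for the sketch.

```python
import random


def draw_random_elements(n=20, rng=random):
    """Two-figure numbers 1..99 with 50 subtracted: uniform on -49..+49."""
    return [rng.randint(1, 99) - 50 for _ in range(n)]


def series_d(eps):
    """D: the random series itself."""
    return list(eps)


def series_b(eps):
    """B: first summation of a random series, first term zero."""
    x = [0]
    for e in eps:
        x.append(x[-1] + e)
    return x


def series_e(eps):
    """E: first difference of a random series."""
    return [eps[t + 1] - eps[t] for t in range(len(eps) - 1)]


def series_c(eps):
    """C: x[t+1] = 0.3 x[t] + eps[t+1], started from zero (assumption)."""
    x, prev = [], 0.0
    for e in eps:
        prev = 0.3 * prev + e
        x.append(prev)
    return x


def series_a(eps):
    """A: first summation of C, so that C is the first difference of A."""
    return series_b(series_c(eps))


if __name__ == "__main__":
    eps = draw_random_elements(20, random.Random(1949))
    for name, build in [("A", series_a), ("B", series_b), ("C", series_c),
                        ("D", series_d), ("E", series_e)]:
        print(name, len(build(eps)))
```

With 20 random elements the sketch gives 21 items in A and B, 20 in C and D and 19 in E.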

The random elements were obtained from Tables of Random
Sampling Numbers [29]. Two figure numbers were extracted, ignoring the
number 00, so that they ranged from 1 to 99. The number 50 was then
subtracted throughout so that we possessed a rectangular distribution
ranging from +49 to -49 with a true mean of zero. We then formed
60 independent series of these random elements, each one 20 items in
length, omitting a few numbers between each series so that we could
later extrapolate for forecasting. The application of these series in
groups of three to the relation (3.5) gave us the 20 equations of set D.
The other transformations were then formed from this basic set. For
example, the set of first summations, series B, was formed by making
the first term of each series zero and summing progressively over each
item of the random set. Simplifications of the calculations involved
were made by using the fact that C is the first difference of A, while B,
D and E are respectively the first summation of a random series, a
random series, and the first difference of a random series. It can be
easily seen that there were 21 items in series A and B, 20 in C and D
and 19 in E. They are therefore analogous in length to most available
economic time series.

In each set a regression analysis was carried out with one
explanatory variable (in this case the error term became ($X_3+u$)), in several
of the sets the analysis was extended to two explanatory variables and
in the case of set B to three explanatory variables. In addition, the
statistic $\delta^2/s^2$ was calculated for the actual error terms and for the
residuals. A complete summary of these calculations is contained in
Table II.

Bias introduced in estimating the autocorrelations of residuals. Given
a set of equations in which the explanatory variables and the error
terms possess the same autoregressive structure, can we say anything
about the way in which the autocorrelations of the residuals vary as
the number of explanatory variables is increased? Figure V presents
this information with each set labelled according to its autoregressive
structure. The number of parameters includes the constant term so that
we have one parameter when only the mean is estimated. Straight lines
have been fitted visually to the points for each set using as additional
points, except for D, the true values of $\delta^2/s^2$, which are zero for A and
B, 1.4 for C and 3.0 for E. The sets A, B and C show a marked bias
upwards as the number of parameters is increased. It is not expected
that this linearity would continue indefinitely but would flatten out
as more than four parameters are used and approach nearer and
nearer to the value of $\delta^2/s^2$ expected for a random series. The random
set D merely shows a distribution around the horizontal straight line
and when we pass to the series of first differences of random numbers E
there is only very slight evidence of a downward movement in the
values of $\delta^2/s^2$ with increasing parameters.⁵

Another way of illustrating the bias in the estimated
autocorrelations of the residuals as more variables are introduced is to apply our
previous test of significance to the individual values of $\delta^2/s^2$ obtained
in set B. This has been done in Table III. As the number of parameters
increases the proportion of residuals which yield a value of $\delta^2/s^2$
significantly different from that expected for a random series at the 5 per
cent level grows smaller; from 19/20 when only the mean is estimated
to only 10/20 when four parameters are used. This is a similar result
to that found for the residuals of actual economic relationships.


TABLE III

SIGNIFICANCE TESTS APPLIED TO RESIDUALS OF SET B

                                                 Number different from random
                                    Number of    at significance levels of       Total number
Explanation                         parameters   2½ per cent     5 per cent      of residuals

Actual error term                       -        [illegible]         19               20
Mean only                               1        [illegible]         19               20
One explanatory variable                2        [illegible]         17               20
One explanatory variable + time         3            13              15               20
Two explanatory variables               3            11              11               20
Two explanatory variables + time        4        [illegible]         10               20


The amount of variance to be explained in an economic time series
can be regarded as composed of two parts, the first due to the smooth
movements of the autoregressive structure of the series and the second


5 The first autocorrelation of the first differences of a random series is -0.5 or

due to the random disturbances. What is important for a real
explanation is that a proportion of the variance due to the disturbances should
be explained as well as that due to the general movement of the series.
Now quite high correlations between autocorrelated series may be
obtained purely by chance⁶ and when this happens what is largely
explained is the variance due to the regular movements through
time. The residuals of such a relationship will be essentially the
year-to-year fluctuations and of a more random character than the original
series. This can be illustrated by comparing the two cases in set B in
which two explanatory variables are used, one of which includes a
linear time trend and the other two real variables. From equations
(4) and (5) in Table II it can be seen that, while the inclusion of time
adds an amount of 0.026 less to the explanation of the variance of the
dependent variable than the inclusion of the second explanatory
variable, the average value of $\delta^2/s^2$ for the residuals is 0.023 greater. These
are the two points which are close together in Figure V for three
parameters and it can be seen that the addition of a linear time trend
in the explanation produced approximately the same bias as the
inclusion of a real explanatory variable. This is confirmed by the average
value of $\delta^2/s^2$ obtained when $X_2$, $X_3$ and $t$ are the explanatory
variables.

Since the inclusion of the bogus variable time had about the same
effect in biasing the residuals towards randomness as the inclusion of
real explanatory variables, we were curious about the effect of including
other types of non-related series in the explanation. We therefore
correlated two unrelated series, $X_2$ and $X_3$, of set B. With $X_2$ as the
dependent variable it was found that the average amount of the variance
explained was 0.32, while the mean value of $\delta^2/s^2$ for the residuals was
0.74. This latter value is slightly higher than that obtained for equation
(3) of Table II where the average explained variance is 0.64 with a mean
value of $\delta^2/s^2 = 0.69$ for the autocorrelation of the residuals. This
suggests that if error terms are autocorrelated then it would
frequently be a mistake to attempt to justify the statistical requirements of
randomness by adding more explanatory variables or by experimenting
with different combinations of the variables. Owing to the shortness
of economic time series, high accidental correlations may be obtained
between the variables added and the error term due to their
autoregressive structures and, since the residuals obtained from the least
squares method of regression are orthogonal to the explanatory
variables, they will tend to be biassed towards a random series.

6 See G. U. Yule, op. cit., and Orcutt and James, op. cit.

TABLE II

[Table II (equations 1 to 14: generating properties and explanatory
variables of each set, mean and standard deviation of $\delta^2/s^2$ for the
actual error series and for the residuals, and calculated variances of the
correlation coefficients and regression parameters; N.C. entries mark
values not calculated) is not legible in this copy.]

Figures in parentheses were calculated assuming a mean of zero.



FIGURE V. AUTOCORRELATION OF RESIDUALS OBTAINED FROM
CONSTRUCTED RELATIONSHIPS
(horizontal axis: number of parameters)


ESTIMATION OF REGRESSION COEFFICIENTS AND PREDICTION
BY LEAST SQUARES FOR RELATIONSHIPS CONTAINING
AUTOCORRELATED ERROR TERMS

Our objectives in this section are to show that the usual application
of the method of least squares to relationships containing highly
positively autocorrelated error terms results in an extremely
inefficient use of data and that it is only necessary to apply a
transformation which will make the error term approximately random in order to
regain most of this efficiency.

The complete information is contained in Table II but in order to
illustrate the position more clearly we have set out some of the more
relevant calculations in Tables IV and V.







TABLE IV

VARIANCES OF REGRESSION PARAMETERS UNDER DIFFERENT
TRANSFORMATIONS USING ONE EXPLANATORY VARIABLE

Generating properties of     Values of $\delta^2/s^2$ for     Variance of
Explanatory    Error         Error                            Correlation     Regression
variable       term          term          Residuals          coefficient     coefficient

A              A             0.31          0.49               [illegible]     [illegible]
B              B             0.45          0.69               [illegible]     [illegible]
C              C             1.49          1.56               [illegible]     [illegible]
D              D             1.98          2.00               0.0056          [illegible]
E              E             3.00          3.01               0.008           [illegible]

The decline in the variances of both the correlation coefficient and
the regression coefficient as the error term becomes random is very
marked. In the case of one explanatory variable the variance of the
correlation coefficient when the error term is of form A is
approximately 11 times the variance when the error term is random, while
the ratio of the corresponding variances of the regression coefficient is
approximately 9 to 1. As we introduce more determining variables into
the explanation, we can see from Table V that the variances of the
regression coefficients decrease until in the limiting case all the
variation in the variable to be determined is explained and there is a
complete fit. This limiting case is of course very rarely approached in
practice and if we consider the set B, where for three explanatory
variables the mean multiple correlation coefficient is as high as 0.97 (see
Table II, equation 6), we find the variances of the regression
coefficients are 0.22 and 0.16 for $b_{12}$ and $b_{13}$ respectively, which from Table
V can be seen to be three times the variances of the regression
coefficients calculated in the random transformation even though the
mean multiple correlation coefficient in this form is only 0.93.


TABLE V

VARIANCES OF REGRESSION PARAMETERS UNDER DIFFERENT
TRANSFORMATIONS USING TWO EXPLANATORY VARIABLES

Generating properties      Values of $\delta^2/s^2$       Variance of
Explanatory    Error       Error                         Multiple correlation    Regression coefficients
variable       term        terms        Residuals        coefficient             $b_{12}$      $b_{13}$

B              B           0.31         1.06             0.012                   0.34          0.48
D              D           2.14         2.15             0.0004                  0.08          0.05
E              E           3.05         2.93             0.001                   0.09          0.10


In Table IV we can see that fluctuations in the variances of the
regression parameters are very small for reasonably large movements
of $\delta^2/s^2$ around the random value, given by the results for C, D and E.
The true values of the autocorrelation coefficients of the error terms
vary from $r_1=0.3$ to $r_1=-0.5$ in these cases. This relative stability
of the variances indicates that a transformation which makes the
error term approximately random will have regained most of the
improvement in the efficiency possible. Similar results would also
appear to be true for the case of two explanatory variables.

In our model there is no real trend, yet the introduction of a linear
trend to sets A and B improves their explanation and reduces the
variance of the regression coefficients. This would seem to be due to
the fact already considered that the trend factor reduces the amount
of autocorrelation in the residuals and can be regarded as one method
of transforming the error term. In these circumstances the introduction
of a polynomial trend may be a useful device in obtaining more
accurate results, but it is difficult to attach an economic meaning to
the coefficients of time.

In order to obtain some idea of the accuracy of estimation of
regression parameters under other possible types of relationships and to
illustrate once more the importance of having the error term random,
we constructed from the series already calculated two sets of
relationships in which the autoregressive structure of the explanatory
variable and the error term were different. The form of the relationship
was:

(4.1)    $X_1 = k + b_{12} X_2 + v$

where $X_2$ was of form A in both sets and $v$ adopted first form B and
second form D. The true values of the constants were $k=0$ and
$b_{12}=2$ while the error term was taken from our previous sets with
$v=X_3+u$. The first differences of each set were calculated and then a
further correction was made to randomize the explanatory variable.
This latter process produced error terms generated by the following
formulae:

         F.  $x_{t+1} = \epsilon_{t+1} - 0.3\epsilon_t$

         G.  $x_{t+1} = (\epsilon_{t+1} - \epsilon_t) - 0.3(\epsilon_t - \epsilon_{t-1})$

where the $\epsilon$'s denote a series of random disturbances. The results of
the calculations are set out in Table VI and the values of $\delta^2/s^2$ provide
additional points for Figure V. In each set it can be seen that a
considerable gain is to be obtained in the efficiency of the estimates of the
correlation coefficients and regression coefficients when the error term
is random. If error terms are really random as postulated by many
economists, there is nothing to be gained from making any
transformation, even though the original series possess high positive
autocorrelation. It can also be seen from the mean values of the regression
coefficients of Tables II and VI that the least squares estimates are not
biassed when the error term is autocorrelated even though they are not
the best estimates.

Tests of Significance. It is well recognized that the ordinary test of
significance for the null hypothesis can be applied to the correlation
between two series provided one of them is random.⁷ This can be seen
to be equivalent to making the error term random in the special case
of a zero regression coefficient. To apply confidence limits it is
necessary that the dependent variable is distributed normally and randomly
around a linear function of the explanatory variable. This is true
even if the explanatory variable is not random.⁸ If economic time series
possess the properties which we are suggesting, then the transformation
to make the error terms random will put them in a form in which it
will be possible to apply confidence limits and test the significance of
regression parameters in the ordinary way.

Prediction. Prediction is one of the primary reasons for undertaking
statistical analysis. In Table VII we present some material derived from
our constructed relations which emphasizes the huge improvement that
it is possible to make if one is dealing with a formulation involving
error terms which are a first summation of random elements. This
table also indicates how misleading the variance of the residuals may
be in such a case.

The fact that the items in column IV are smaller than those in
column V is, of course, to be expected, since the regression parameters
have been chosen to minimize the mean square of the residuals and the
true errors are those obtained by use of the true values of the regression
parameters. In the cases of random error terms, rows 2 and 5, this
downward bias is small and could, if desired, be easily compensated
by taking account of the number of parameters fitted. In the cases
of error terms which are the first summation of random numbers, the
downward bias is exceedingly large for series of this length and should

7 See M. S. Bartlett, "Some Aspects of the Time-Correlation Problem in regard to Tests of
Significance," Journal of the Royal Statistical Society, Vol. 98, 1935, pp. 536-543.

8 See R. A. Fisher, "The Goodness of Fit of Regression Formulae and the Distribution of
Regression Coefficients," Journal of the Royal Statistical Society, Vol. 85, 1922, pp. 597-612, and H. Cramér,
"Mathematical Methods of Statistics," op. cit., pp. 646-666.

emphasize the caution needed in interpreting standard errors of
estimate if the error terms are likely to be highly positively
autocorrelated. Column VI gives the variance of the errors of prediction one
item beyond the parts of the series utilized for estimating the
regression parameters. That is, each of the series involved in each set of twenty
equations was extended one item and the dependent variable then
predicted with a knowledge of the regression coefficients previously
calculated. Column VI again illustrates in a rather simple way how
misleading the variance of residuals may be when the error terms are
autocorrelated, as in rows 1, 3 and 4. It should of course be realized that the
much smaller variances obtained in rows 2 and 5 are due both to the
fact that better estimates of the regression parameters have been
obtained and used in these cases and also that the prediction formula
makes use of the fact that the errors involved in rows 1, 3 and 4 are
the first summation of random numbers. Thus, whereas in row 1 the
estimating formula was

(4.3)    $X_{1,n+1} = a_1 + b_{12} X_{2,n+1},$

in row 2, the estimating formula was

(4.4)    $X_{1,n+1} = a_1' + b_{12}'(X_{2,n+1} - X_{2,n}) + X_{1,n}.$

The errors involved in the prediction formula (4.4) are therefore
random in time whereas those in (4.3) are first summations of random
terms.
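
The two prediction formulae (4.3) and (4.4) can be compared in a short sketch. The helper `fit_line` and the two prediction functions are hypothetical names introduced here for illustration; the sketch assumes a single explanatory variable and a constant term fitted in both forms.

```python
def fit_line(x, y):
    """Ordinary least squares of y on x with a constant: returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def predict_levels(x2, x1, x2_next):
    """Formula (4.3): fit in the original form and predict the level."""
    a, b = fit_line(x2, x1)
    return a + b * x2_next


def predict_differences(x2, x1, x2_next):
    """Formula (4.4): fit in first differences, predict the change and
    add it to the last observed value of the dependent variable."""
    d2 = [x2[t] - x2[t - 1] for t in range(1, len(x2))]
    d1 = [x1[t] - x1[t - 1] for t in range(1, len(x1))]
    a, b = fit_line(d2, d1)
    return a + b * (x2_next - x2[-1]) + x1[-1]


if __name__ == "__main__":
    x2 = [1, 2, 4, 7]
    x1 = [5, 7, 11, 17]   # exact relation x1 = 3 + 2 * x2, no error term
    print(predict_levels(x2, x1, 8))
    print(predict_differences(x2, x1, 8))
```

With no error term both formulae give the same prediction; they differ once the errors are first summations of random terms, because (4.4) starts from the last observed value.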

TABLE VII

A COMPARISON OF THE VARIANCES OF RESIDUALS, TRUE ERRORS AND
PREDICTIONS OBTAINED FROM SEVERAL TRANSFORMATIONS OF
THE CONSTRUCTED RELATIONS

                                                                               Variance of errors
       Generating properties of    Number of      Mean          Mean           of predictions
       Explanatory    Error        explanatory    variance      variance of    one item beyond
No.    variable       term         variables      of residuals  true errors    sample
       I              II           III            IV            V              VI

1      B              B            1              5142          [illegible]    7479
2      D              D            1              1375          [illegible]     933
3      B              B            2+time          784          4386           7127
4      B              B            2              1690          4386           3991
5      D              D            2               634           749            774


A TENTATIVE METHOD OF PROCEDURE

Having recognized that the error terms implicit in many current
formulations of economic relations are highly positively
autocorrelated, and also having recognized the importance of carrying out
estimation and prediction by means of relations involving random
error terms, how shall we proceed when faced with a practical
situation? One way of evading this problem would be to change some of the
variables, add additional variables, or modify the form of the relation
until a relationship involving what appear to be random error terms
is found. However, while this may possibly be a satisfactory way out in
some cases, it obviously does not help much if by some means or other
one has arrived at what is believed to be the most reasonable choice
of variables and form of relation. This choice of variables and form
of relation usually does not involve any specification of whether or not
the errors are autocorrelated and what is required is the best method
of estimating the parameters and various standard errors of the
chosen relation, and not some other relation. In this situation the
objective, of course, is to make an autoregressive transformation of the
dependent and independent variables such that the error term becomes
random. If the autoregressive properties of the error term were known,
then it would simply be a matter of making the indicated
autoregressive transformation as illustrated in section 2. The real problem arises
when the autoregressive properties of the error term are not known
but must be estimated. Except for the fact, which our experiments
demonstrate, that nearly optimum results can be achieved if the error
term is only a rough approximation to a random series, solution of the
problem would seem rather hopeless for series of only twenty items.

One fairly obvious procedure, which we are inclined to rule out
because of the large biases demonstrated in section 3, would be the
following iterative process. First estimate the desired regression
coefficients by ordinary least squares and obtain the resulting series of
residuals. Then estimate from those residuals by least squares the
autoregressive parameters of a one or two lag difference equation. Use these
autoregressive parameters to make an autoregressive transformation
of the observed series aimed at randomizing the error term, and
re-estimate the desired regression coefficients. Put these revised estimates
back in the original equation, obtain the resulting series of residuals and
estimate their autoregressive parameters. Use these to make a new
autoregressive transformation of the original series and so on until
estimates of the desired regression coefficients are obtained which
are consistent with estimates of the autoregressive parameters of
the residuals in the sense that no further adjustments are necessary.
Since it is only necessary to make the error term approximately random
it is unlikely that much would be gained by carrying the above process
more than one or two rounds. The real difficulty with this procedure
is that the series of residuals will, as shown in section 3, be strongly
biassed towards randomness and therefore the autoregressive
transformation based in the above way on the residuals may not in fact go
far enough in randomizing the error term.
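
The iterative process described above can be sketched as follows for one explanatory variable and a one lag difference equation for the residuals. This is an illustration under our own assumptions, not a procedure given in the paper: the function names are hypothetical, the autoregressive parameter is estimated by least squares through the origin, and the number of rounds is fixed in advance rather than tested for convergence.

```python
def fit_line(x, y):
    """Ordinary least squares of y on x with a constant: returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def lag_one_coefficient(res):
    """Least squares estimate of rho in res[t] = rho * res[t-1] + e[t]."""
    den = sum(res[t - 1] ** 2 for t in range(1, len(res)))
    if den == 0:
        return 0.0  # residuals vanish: nothing to transform
    return sum(res[t] * res[t - 1] for t in range(1, len(res))) / den


def iterative_estimate(x, y, rounds=2):
    """Alternate between estimating (a, b) and the lag-one parameter rho.

    Assumes the estimated rho is not equal to one, since the constant
    is recovered by dividing by (1 - rho).
    """
    a, b = fit_line(x, y)                      # first round: ordinary least squares
    rho = 0.0
    for _ in range(rounds):
        # residuals from the ORIGINAL equation with the current estimates
        res = [yi - a - b * xi for xi, yi in zip(x, y)]
        rho = lag_one_coefficient(res)
        # autoregressive transformation of both observed series
        xs = [x[t] - rho * x[t - 1] for t in range(1, len(x))]
        ys = [y[t] - rho * y[t - 1] for t in range(1, len(y))]
        a_star, b = fit_line(xs, ys)
        a = a_star / (1 - rho)                 # constant of the original equation
    return a, b, rho


if __name__ == "__main__":
    x = [0, 1, 2, 3, 4, 5, 6, 7]
    y = [1.0, 3.5, 5.8, 7.9, 9.2, 10.6, 12.9, 15.4]
    print(iterative_estimate(x, y))
```

As the text notes, the residuals are biassed towards randomness, so the value of rho obtained in this way will tend to understate the autocorrelation of the true error term.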

An alternative procedure which appears more promising to us is
that of selecting an autoregressive transformation of the series involved
such that the autocorrelations of the series of residuals are
approximately equal to the expected values of autocorrelations of random
series of the same length. We have not worked out an efficient
procedure for doing this; but, if one is willing to approximate the
autoregressive properties of the error term by a one or even two lag linear
difference equation, it is fairly easy after one or two trials to choose an
autoregressive transformation which will result in residuals that are
sufficiently random. Furthermore, if our evidence that many error
terms appear to be approximately first summations of random terms is
accepted, then the obvious procedure is to work with first differences of
the series used. Thus, given a relation between ordinary economic
variables

(5.1)    $X_{1t} = a_1 + b_{12} X_{2t} + b_{13} X_{3t}$

we suggest as a first approximation estimation and prediction in the
form

(5.2)    $(X_{1t} - X_{1,t-1}) = b_{12}(X_{2t} - X_{2,t-1}) + b_{13}(X_{3t} - X_{3,t-1}).$

If (5.1) had contained a linear trend then (5.2) would have contained a
constant term. The residuals from (5.2) can be obtained and tested
for randomness.
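
Estimation in the first difference form (5.2) can be sketched as below. The function names are ours; the sketch assumes two explanatory variables and no constant term, as in (5.2), and solves the two normal equations directly.

```python
def first_differences(x):
    """Return the series x[t] - x[t-1]."""
    return [x[t] - x[t - 1] for t in range(1, len(x))]


def fit_two_no_constant(z2, z3, z1):
    """Least squares of z1 on z2 and z3 without a constant term."""
    s22 = sum(v * v for v in z2)
    s33 = sum(v * v for v in z3)
    s23 = sum(u * v for u, v in zip(z2, z3))
    s12 = sum(u * v for u, v in zip(z2, z1))
    s13 = sum(u * v for u, v in zip(z3, z1))
    det = s22 * s33 - s23 ** 2          # zero if z2 and z3 are proportional
    b12 = (s12 * s33 - s13 * s23) / det
    b13 = (s13 * s22 - s12 * s23) / det
    return b12, b13


def estimate_in_first_differences(x1, x2, x3):
    """Fit (5.2) and return (b12, b13, residuals of the differenced form)."""
    d1, d2, d3 = (first_differences(s) for s in (x1, x2, x3))
    b12, b13 = fit_two_no_constant(d2, d3, d1)
    residuals = [u - b12 * v - b13 * w for u, v, w in zip(d1, d2, d3)]
    return b12, b13, residuals


if __name__ == "__main__":
    x2 = [0, 1, 3, 2, 5]
    x3 = [1, 0, 2, 5, 4]
    x1 = [7 + 2 * u + v for u, v in zip(x2, x3)]   # exact relation, no error
    print(estimate_in_first_differences(x1, x2, x3))
```

The residuals returned are those of the differenced relation, and it is to these that a test of randomness such as the ratio $\delta^2/s^2$ would be applied.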

If we prove to be right about the nature of most error terms in
current formulations of economic relations, then the residuals of the
first difference transformation will turn out to be sufficiently random
and no further steps will be necessary. If the residuals in this form do
not turn out to be sufficiently random, then a new transformation can
be devised on the basis of their autocorrelations. The main
advantages of this procedure are, first, that in many cases it will result
immediately in the correct transformation and, secondly, that when it
does not it will usually result in residuals that are not highly positively
autocorrelated and thereby reduce the amount of bias towards
randomness which is present in this case. This will be a help in devising
successive autoregressive transformations.

On the basis of this study Richard Stone⁹ has recalculated a number
of demand studies for the United Kingdom 1920-38. The general
results will be published by Stone, but he has kindly made available
to us the material presented in Table VIII. We present this material
as further evidence that in many cases the use of first differences does
result in essentially random series. It also seems reassuring, in so far as
Stone's work is concerned, and rather remarkable, that in most cases
the multiple correlations for the relations in first difference form
remained very high.

TABLE VIII

VALUES OF $\delta^2/s^2$ FOR A NUMBER OF DEMAND STUDIES FOR THE
UNITED KINGDOM 1920-38

                                         Values of $\delta^2/s^2$       Adjusted multiple
                                         for residuals                  correlation coefficient
                          Number of      Original    First              Original    First
Commodity                 parameters     data        differences        data        differences

Beer                      3              1.28        1.86               0.989       0.962
                          4              1.13 }      2.01               0.989 }     0.977
                          4+time         1.23 }                         0.993 }
Spirits                   3+time         1.26        2.63               0.992       0.875
Telegrams                 3              1.24        1.61               0.985       0.967
                          4+time         1.10        1.65               0.987       0.966
Imported wine             4              1.49        1.84               0.893       0.754
Communication services    3+time         0.71        2.05               0.996       0.834
                          4+time         0.70        2.11               0.996       0.822
Lard                      3+time         0.90        2.06               0.838       0.864
Margarine                 4              1.26 }      1.80               0.959 }     0.748
                          4+time         2.02 }                         0.969 }
                          3+time         2.31        2.31               0.976       0.756

Mean value of $\delta^2/s^2$             1.28        1.99

APPENDIX TO SECTION II

It is of interest to compare the simple solution presented in section
II with the general solution given by Aitken [30]. We shall not repeat

9 These studies were originally given in his paper on "Analysis of Demand," op. cit., but the
recalculations were made on the basis of revised estimates of the data.

his elegant and rigorous proofs but shall merely illustrate his approach
and deduce the special case where the error series follows a simple
Markoff scheme. For this it is necessary to follow his generality of
notation and employ matrices and vectors, using $P'$ and $y'$ to denote the
matrix or vector obtained by transposing $P$ or $y$, and $P^{-1}$ as the
inverse matrix of $P$.

Consider first the simple case of least squares with non-autocorrelated
errors. Let the approximate representation of the column vector
of data

(6.1) $y = \{y_1\; y_2\; \cdots\; y_n\}$

by the column vector

(6.2) $z = \{z_1\; z_2\; \cdots\; z_n\}$

be linear in terms of a set of $(k+1)$ prescribed functions

(6.3) $1,\; x_{1t},\; x_{2t},\; \cdots,\; x_{kt} \qquad (t = 1, \cdots, n)$.

Let $P$ denote the matrix of these functional values so that the $t$th
row of $P$ is the row vector

(6.4) $[1,\; x_{1t},\; x_{2t},\; \cdots,\; x_{kt}]$.

Then $P$ is of order $n \times (k+1)$ and with the restriction of linear
independence over the $n$ values $x_{i1}, \cdots, x_{in}$, it is of rank $(k+1)$. Let $a$
denote a column vector of $(k+1)$ coefficients

(6.5) $a = \{a_0\; a_1\; a_2\; \cdots\; a_k\}$.

Then the set of values $z_t$ is the vector

(6.6) $z = Pa$.

If the data $y$ are independent then the principle of least squares
minimizes the sum of the squared residuals. This is the vector product

(6.7) $S^2 = (y - Pa)'(y - Pa)$

and for the minimal conditions $\partial S^2/\partial a = 0$ we obtain the set of normal
equations

(6.8) $P'Pa = P'y$.
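As a small illustration of the normal equations (6.8), the following sketch forms $P'P$ and $P'y$ for a single regressor and solves for the coefficient vector. The helper names (`transpose`, `matmul`, `solve`, `least_squares`) and the five data points are hypothetical and chosen only for this example; the data lie exactly on a line, so the recovered coefficients are the intercept and slope used to generate them.

```python
def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def least_squares(P, y):
    """Solve the normal equations P'P a = P'y of (6.8)."""
    Pt = transpose(P)
    PtP = matmul(Pt, P)
    Pty = [sum(p * v for p, v in zip(row, y)) for row in Pt]
    return solve(PtP, Pty)


# Illustrative data lying exactly on y = 1 + 2x.
x_values = [0.0, 1.0, 2.0, 3.0, 4.0]
y_values = [1.0 + 2.0 * x for x in x_values]
P = [[1.0, x] for x in x_values]   # t-th row is [1, x_t], as in (6.4)
a = least_squares(P, y_values)
print(a)
```

Solving the normal equations directly is adequate for a sketch of this size; it is not meant as a recommendation for ill-conditioned problems.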

Having established this general result for least squares, Aitken
extends the argument to the case of autocorrelated errors. If the set of
errors be arranged according to their variances and covariances by the
elements of a symmetric matrix $U$ of order $n \times n$, then the least squares



LEAST SQUARES REGRESSION


estimates are obtained by minimizing

(6.9) $(y - Pa)'U^{-1}(y - Pa)$.

Differentiating in the manner above, we obtain the set of more general
normal equations

(6.10) $P'U^{-1}Pa = P'U^{-1}y$.

Let us now apply these general results to a simple specific example.
Suppose we have a linear relation

(6.11) $Y_t = \alpha_0 + \alpha_1 X_t + u_t \qquad (t = 1, \cdots, n)$

where $u_t$ is defined by the simple Markoff process

(6.12) $u_t = \beta u_{t-1} + \epsilon_t \qquad (\beta < 1)$

where $\beta$ is a known constant and $\epsilon_t$ a random disturbance. Our
variance-covariance matrix of errors may be defined by the symmetric
matrix of order $n \times n$, where we have assumed unit variance of $\epsilon_t$ for
simplicity, although the final result would not be altered if we did not.





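(6.13) $U = \dfrac{1}{1-\beta^2}\begin{bmatrix} 1 & \beta & \beta^2 & \cdots & \beta^{n-1} \\ \beta & 1 & \beta & \cdots & \beta^{n-2} \\ \beta^2 & \beta & 1 & \cdots & \beta^{n-3} \\ \vdots & \vdots & \vdots & & \vdots \\ \beta^{n-1} & \beta^{n-2} & \beta^{n-3} & \cdots & 1 \end{bmatrix}$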


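from which we obtain the symmetric inverse matrix

(6.14) $U^{-1} = \begin{bmatrix} 1 & -\beta & 0 & \cdots & 0 & 0 \\ -\beta & 1+\beta^2 & -\beta & \cdots & 0 & 0 \\ 0 & -\beta & 1+\beta^2 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 1+\beta^2 & -\beta \\ 0 & 0 & 0 & \cdots & -\beta & 1 \end{bmatrix}$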
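The pair (6.13) and (6.14) can be checked numerically. The sketch below builds both matrices for an illustrative $\beta = 0.6$ and $n = 5$ (values assumed for this example only, as are the function names) and measures how far their product is from the identity matrix.

```python
def markoff_covariance(beta, n):
    """Matrix U of (6.13): element (i, j) is beta**|i-j| / (1 - beta**2)."""
    scale = 1.0 / (1.0 - beta ** 2)
    return [[scale * beta ** abs(i - j) for j in range(n)] for i in range(n)]


def markoff_inverse(beta, n):
    """Tridiagonal matrix of (6.14); assumes n >= 2."""
    inv = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Corner diagonal elements are 1, interior ones are 1 + beta**2.
        inv[i][i] = 1.0 if i in (0, n - 1) else 1.0 + beta ** 2
        if i + 1 < n:
            inv[i][i + 1] = -beta
            inv[i + 1][i] = -beta
    return inv


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


beta, n = 0.6, 5
product = matmul(markoff_covariance(beta, n), markoff_inverse(beta, n))
# Largest absolute deviation of U * U^{-1} from the identity matrix.
max_error = max(
    abs(product[i][j] - (1.0 if i == j else 0.0))
    for i in range(n)
    for j in range(n)
)
print(max_error)
```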



The matrix $P$ is of order $n \times 2$ where the $t$th row is

(6.15) $[1 \;\; X_t]$

while the vector of coefficients becomes the column vector

(6.16) $a = \begin{bmatrix} a_0 \\ a_1 \end{bmatrix}$.





Applying these components to the general normal equations (6.10)
and expanding we obtain the estimate of $\alpha_1$ as

(6.17) $a_1 = \dfrac{\sum_{1}^{n} x_t y_t - \beta\sum_{2}^{n} x_t y_{t-1} - \beta\sum_{2}^{n} x_{t-1} y_t + \beta^2\sum_{3}^{n} x_{t-1} y_{t-1}}{\sum_{1}^{n} x_t^2 - 2\beta\sum_{2}^{n} x_t x_{t-1} + \beta^2\sum_{3}^{n} x_{t-1}^2}$

where $x_t$, $y_t$ are in terms of deviations from their means which are
given by

(6.18) $\bar{X} = \dfrac{1}{n-(n-2)\beta}\left(\sum_{1}^{n} X_t - \beta\sum_{2}^{n-1} X_t\right)$, and similarly for $\bar{Y}$.

These are completely general results for error terms of the simple
type considered and do not involve any assumptions about the
distribution of the random disturbances $\epsilon_t$. If $\epsilon_t$ are normally distributed
then we have a maximum likelihood solution.
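The agreement between the matrix form (6.10) and the expanded form (6.17) with the weighted means (6.18) can be checked on a small example. In the sketch below the eight data points, the value $\beta = 0.5$ and all function names are assumptions made purely for illustration; the summation limits follow the reading of (6.17) and (6.18) in which the squared-$\beta$ sums run over the interior observations only.

```python
def markoff_inverse(beta, n):
    """Tridiagonal inverse matrix of (6.14); assumes n >= 2."""
    inv = [[0.0] * n for _ in range(n)]
    for i in range(n):
        inv[i][i] = 1.0 if i in (0, n - 1) else 1.0 + beta ** 2
        if i + 1 < n:
            inv[i][i + 1] = -beta
            inv[i + 1][i] = -beta
    return inv


def gls_matrix(X, Y, beta):
    """Solve P'U^{-1}P a = P'U^{-1}y of (6.10) for P with rows [1, X_t]."""
    n = len(X)
    W = markoff_inverse(beta, n)
    P = [[1.0, x] for x in X]
    WP = [[sum(W[i][k] * P[k][j] for k in range(n)) for j in range(2)] for i in range(n)]
    Wy = [sum(W[i][k] * Y[k] for k in range(n)) for i in range(n)]
    m = [[sum(P[i][r] * WP[i][c] for i in range(n)) for c in range(2)] for r in range(2)]
    v = [sum(P[i][r] * Wy[i] for i in range(n)) for r in range(2)]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    a0 = (v[0] * m[1][1] - m[0][1] * v[1]) / det   # Cramer's rule
    a1 = (m[0][0] * v[1] - m[1][0] * v[0]) / det
    return a0, a1


def weighted_mean(Z, beta):
    """Mean of (6.18): interior observations enter with weight (1 - beta)."""
    n = len(Z)
    return (sum(Z) - beta * sum(Z[1:n - 1])) / (n - (n - 2) * beta)


def gls_closed_form(X, Y, beta):
    """Slope estimate of (6.17), using deviations from the means of (6.18)."""
    n = len(X)
    x = [v - weighted_mean(X, beta) for v in X]
    y = [v - weighted_mean(Y, beta) for v in Y]
    num = (sum(x[t] * y[t] for t in range(n))
           - beta * sum(x[t] * y[t - 1] for t in range(1, n))
           - beta * sum(x[t - 1] * y[t] for t in range(1, n))
           + beta ** 2 * sum(x[t - 1] * y[t - 1] for t in range(2, n)))
    den = (sum(x[t] ** 2 for t in range(n))
           - 2.0 * beta * sum(x[t] * x[t - 1] for t in range(1, n))
           + beta ** 2 * sum(x[t - 1] ** 2 for t in range(2, n)))
    return num / den


# Illustrative data only.
X = [1.0, 2.0, 4.0, 3.0, 5.0, 7.0, 6.0, 8.0]
Y = [2.1, 3.9, 8.2, 6.1, 9.8, 14.3, 12.2, 15.9]
beta = 0.5
a0_matrix, a1_matrix = gls_matrix(X, Y, beta)
a1_closed = gls_closed_form(X, Y, beta)
print(a1_matrix, a1_closed)
```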

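Comparing the estimate (6.17) with that obtained by our modified
transformation procedure of section II we have from (6.11) and (6.12)

(6.19) $Y_t - \beta Y_{t-1} = \alpha_0' + \alpha_1(X_t - \beta X_{t-1}) + \epsilon_t$

where the least squares estimate of $\alpha_1$ is

(6.20) $\bar{a}_1 = \dfrac{\sum_{2}^{n} x_t y_t - \beta\sum_{2}^{n} x_t y_{t-1} - \beta\sum_{2}^{n} x_{t-1} y_t + \beta^2\sum_{2}^{n} x_{t-1} y_{t-1}}{\sum_{2}^{n} x_t^2 - 2\beta\sum_{2}^{n} x_t x_{t-1} + \beta^2\sum_{2}^{n} x_{t-1}^2}$

where the means are calculated by

(6.21) $\dfrac{1}{n-1}\sum_{2}^{n} X_t$ and $\dfrac{1}{n-1}\sum_{2}^{n} X_{t-1}$.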


If we represent the numerator and denominator of (6.20) by $A$ and $B$
respectively we obtain

(6.22) $\bar{a}_1 = \dfrac{A}{B}$

so that the estimator given by (6.17) is

(6.23) $a_1 = \dfrac{A + x_1 y_1(1 - \beta^2)}{B + x_1^2(1 - \beta^2)}$.





The reason for this difference is that $\bar{a}_1$ ignores the possibility of
making use of the first error term $u_1$, and estimates the regression
coefficients using only $(n-1)$ transformed terms. The sum of squares
of the $(n-1)$ terms is

(6.24) $\sum_{2}^{n} \epsilon_t^2 = \sum_{2}^{n} (u_t - \beta u_{t-1})^2$.

The first term may be introduced by using the fact that the expected
value of $\epsilon_1^2$ given $u_1$ is

(6.25) $E(\epsilon_1^2) = (1 - \beta^2)u_1^2$

so that

(6.26) $S^2 = \sum_{1}^{n} \epsilon_t^2 = \sum_{2}^{n} (u_t - \beta u_{t-1})^2 + (1 - \beta^2)u_1^2$.

If we substitute for the $u$'s in terms of $x$ and $y$ from (6.11) and
minimize in the ordinary way with respect to $\alpha_0$ and $\alpha_1$, we again obtain
the solutions (6.17) and (6.18). It can be seen therefore that $\bar{a}_1$ is an
unbiased estimate of $\alpha_1$ but by ignoring the first term a maximum of
one degree of freedom is lost in the transformation procedure as $\beta$
approaches zero. As $\beta$ approaches unity the difference between $\bar{a}_1$ and
$a_1$ approaches zero and when $\beta = 1$ the solutions (6.17) and (6.20) are
identical and the obvious course is to make a first difference
transformation.
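The role of the first term in (6.26) can be illustrated numerically. The sketch below fits the transformed relation (6.19) twice by ordinary least squares: once on the $(n-1)$ transformed observations alone, and once with an extra first observation scaled by $\sqrt{1-\beta^2}$, which is one way of minimizing (6.26). The second fit should reproduce the estimate (6.17). The data, the value $\beta = 0.5$ and the function names are assumptions made for this example.

```python
import math


def fit_two_regressors(rows):
    """Least squares for y = a0*r0 + a1*r1, given rows of (y, r0, r1)."""
    s00 = sum(r0 * r0 for _, r0, _ in rows)
    s01 = sum(r0 * r1 for _, r0, r1 in rows)
    s11 = sum(r1 * r1 for _, _, r1 in rows)
    t0 = sum(y * r0 for y, r0, _ in rows)
    t1 = sum(y * r1 for y, _, r1 in rows)
    det = s00 * s11 - s01 * s01
    return (t0 * s11 - s01 * t1) / det, (s00 * t1 - s01 * t0) / det


def transformed_rows(X, Y, beta, keep_first):
    """Rows of the transformed relation (6.19); optionally add the scaled first term."""
    rows = [(Y[t] - beta * Y[t - 1], 1.0 - beta, X[t] - beta * X[t - 1])
            for t in range(1, len(X))]
    if keep_first:
        k = math.sqrt(1.0 - beta ** 2)   # weight implied by (6.25)
        rows.insert(0, (k * Y[0], k, k * X[0]))
    return rows


def weighted_mean(Z, beta):
    """Mean of (6.18)."""
    n = len(Z)
    return (sum(Z) - beta * sum(Z[1:n - 1])) / (n - (n - 2) * beta)


def gls_closed_form(X, Y, beta):
    """Slope estimate of (6.17)."""
    n = len(X)
    x = [v - weighted_mean(X, beta) for v in X]
    y = [v - weighted_mean(Y, beta) for v in Y]
    num = (sum(x[t] * y[t] for t in range(n))
           - beta * sum(x[t] * y[t - 1] + x[t - 1] * y[t] for t in range(1, n))
           + beta ** 2 * sum(x[t - 1] * y[t - 1] for t in range(2, n)))
    den = (sum(x[t] ** 2 for t in range(n))
           - 2.0 * beta * sum(x[t] * x[t - 1] for t in range(1, n))
           + beta ** 2 * sum(x[t - 1] ** 2 for t in range(2, n)))
    return num / den


# Illustrative data only.
X = [1.0, 2.0, 4.0, 3.0, 5.0, 7.0, 6.0, 8.0]
Y = [2.1, 3.9, 8.2, 6.1, 9.8, 14.3, 12.2, 15.9]
beta = 0.5
_, slope_without_first = fit_two_regressors(transformed_rows(X, Y, beta, keep_first=False))
_, slope_with_first = fit_two_regressors(transformed_rows(X, Y, beta, keep_first=True))
slope_6_17 = gls_closed_form(X, Y, beta)
print(slope_without_first, slope_with_first, slope_6_17)
```

Because the constant regressor is entered as $(1-\beta)$ in the transformed rows, the first coefficient returned by the fit is an estimate of $\alpha_0$ itself rather than of $\alpha_0'$.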

In the case of multivariate regression the procedure of transforming 
the variables and applying ordinary least squares analysis provides a 
much simpler solution than the method indicated by (6.17). The trans¬ 
formation procedure also provides a simpler solution in the case 
where the autoregressive structure of the error term comprises a linear 
stochastic difference equation involving two or more lagged terms. 

REFERENCES

[1] R. Frisch, Statistical Confluence Analysis by Means of Complete Regression
Systems, Oslo, 1934.

[2] T. Haavelmo, “The Probability Approach in Econometrics,” Econometrica,
Vol. 12, Supplement, July 1944, and “The Statistical Implications of a
System of Simultaneous Equations,” Econometrica, Vol. 11, 1943, pp. 1-12.

[3] T. Koopmans, “Statistical Estimation of Simultaneous Economic
Relations,” Journal of the American Statistical Association, Vol. 40, 1945, pp.
448-466, and “Statistical Methods of Measuring Economic Relationships,”
Cowles Commission Discussion Papers, Statistics No. 310. (Mimeographed
copy of lectures delivered at the University of Chicago 1947.)




[4] See for example M. A. Girshick and T. Haavelmo, “Statistical Analysis of
the Demand for Food: Examples of Simultaneous Estimation of Structural
Equations,” Econometrica, Vol. 15, 1947, pp. 79-110; J. Marschak and
W. H. Andrews, “Random Simultaneous Equations and the Theory of
Production,” Econometrica, Vol. 12, pp. 143-205; for a mathematical treatment,
T. W. Anderson and H. Rubin, “Estimation of the Parameters of a Single
Stochastic Difference Equation in a Complete System,” to be published in
Annals of Mathematical Statistics.

[5] G. Tintner, “The Variate Difference Method,” Cowles Commission
Monograph No. 5, 1940.

[6] G. U. Yule, “Why do we sometimes get nonsense correlations between
time-series etc.?,” Journal of the Royal Statistical Society, Vol. 89, 1926, pp. 1-64;
M. S. Bartlett, “Some Aspects of the Time-Correlation problem in regard
to Tests of Significance,” Journal of the Royal Statistical Society, Vol. 98,
1935, pp. 536-543, and “On the Theoretical Specification and Sampling
Properties of Autocorrelated Time Series,” Journal of the Royal Statistical
Society, Vol. 8, 1946, pp. 27-41; Galvenius and H. Wold, “Statistical Tests
of H. Alfven's Theory of Sunspots,” Arkiv for Matematik, Astronomi
och Fysik, Band 34A, No. 24, pp. 1-9; G. H. Orcutt and S. P. James, “Testing
the Significance of Correlation between Time Series,” Biometrika, Vol. 35,
1948, pp. 1-17.

[7] A. C. Aitken, “On Least Squares and Linear Combinations of
Observations,” Proceedings of Royal Society Edinburgh, Vol. 55, 1934/5, pp. 42-48.

[8] D. G. Champernowne, “Sampling Theory applied to Autoregressive
Sequences,” to be published in Journal of the Royal Statistical Society, Series B,
Vol. 10, 1948.

[9] R. Frisch, op. cit.

[10] T. Koopmans, “Linear Regression Analysis of Economic Time Series,”
Netherlands Economic Institute, Haarlem, 1937.

[11] G. Tintner, “Some Applications of Multivariate Analysis to Economic
Data,” Journal of the American Statistical Association, Vol. 41, 1946, pp.
472-500.

[12] O. Reiersøl, “Confluence Analysis by Means of Instrumental Sets of
Variables,” Arkiv for Matematik, Astronomi och Fysik, Band 32A, No. 4, 1945.

[13] R. C. Geary, “Determination of Unbiased Linear Relations between the
Systematic Parts of Variables with Errors of Observation,” Econometrica,
Vol. 17, 1949.

[14] G. H. Orcutt and D. Cochrane, “A Sampling Study of the Merits of Certain
Transformations in Regression Analysis,” to be published.

[15] For a general proof, see F. N. David and J. Neyman, “Extension of the
Markoff Theorem on Least Squares,” Statistical Research Memoirs, Vol. II,
London, 1938, pp. 105-116; also C. R. Rao, “Generalisation of Markoff's
Theorem and Tests of Linear Hypothesis,” Sankhya, Vol. 7, 1945, pp. 9-16.

[16] H. Cramér, Mathematical Methods of Statistics, Princeton, 1946, pp. 548-555,
and M. S. Bartlett, “On the Theory of Statistical Regression,” Proceedings
of Royal Society Edinburgh, Vol. 53, 1933, pp. 260-283.

[17] A. C. Aitken, op. cit.

[18] G. H. Orcutt, “A Study of the Autoregressive Nature of the Time Series
used for Tinbergen's Model of the Economic System of the United States
1919-1932,” Journal of the Royal Statistical Society, Vol. 10, Series B, 1948,
pp. 1-53.

[19] J. Tinbergen, “Statistical Testing of Business-Cycle Theories Vol. II; 
Business Cycles in the United States of America 1919-1932,” League of 
Nations, Geneva, 1939. 

[20] Herman Wold, “A Study in the Analysis of Stationary Time Series,” Upp¬ 
sala 1938, pp. 171-174. 

[21] Econometrica, Vol. 15, 1947, pp. 111-151.

[22] Mimeographed paper distributed by the author and the Cowles
Commission for Research in Economics, Chicago.

[23] Op. cit.

[24] Richard Stone, “The Analysis of Market Demand,” Journal of the Royal
Statistical Society, Vol. 108, 1945, pp. 286-391.

[25] The relationship between the ratio of the mean square successive difference 
to the variance and the serial or autocorrelation coefficient for an infinite 
series is given by 


n - 1 - 

where $r_1$ is the first autocorrelation. It can be seen that as $r_1$ moves from
+1 to −1 the ratio $\delta^2/s^2$ moves from 0 to 4.

[26] The actual residuals were not published in the paper by Richard Stone but 
he has very kindly let us have the calculated residuals for 17 equations 
which include some revised estimates and a few additional relationships (see 
Table VIII). 

[27] For a more detailed discussion of reduced form methods see Girshick and
Haavelmo, op. cit., especially p. 85.

[28] J. von Neumann, “Distribution of the Ratio of the Mean Square Successive 
Difference to the Variance,” Annals of Mathematical Statistics, Vol. 12, pp. 
367-395; B. S. Hart and J. von Neumann, “Tabulation of the Probabilities 
for the Ratio of the Mean Square Successive Differences to the Variance,” 
Annals of Mathematical Statistics, Vol. 13, pp. 207-214. 

[29] M. G. Kendall and B. Babington Smith, Tracts for Computers No. Cambridge
University Press, 1939.

[30] A. C. Aitken, “On Least Squares and Linear Combinations of Observations,”
op. cit.



AOQL SINGLE SAMPLING PLANS FROM 
A SINGLE CHART AND TABLE* 

Donald J. Greb

Chief, Quality Control Engineer, Minneapolis-Honeywell Regulator Co.,
Minneapolis, Minnesota

AND

Julio N. Berrettoni
Consultant Economist and Statistician
Minneapolis, Minnesota

This paper presents a single chart and table from which
AOQL (Average Outgoing Quality Limit) Single Sampling
Plans may be determined with ease. These plans yield a close
approximation to minimum inspection both for unknown
incoming quality and for known average incoming quality unless
the variation in quality from lot to lot is extremely small.

AOQL SINGLE SAMPLING PLANS 

Chart I and Table II present a set of AOQL Single Sampling
Plans. Their manipulation is simple. Given AOQL and lot size
(N), locate on the chart the c-zone of their point of intersection. For
example, if AOQL = 1% and N = 1000, the point of intersection on the
chart falls between the two parallel diagonal lines of zone c = 1.¹ This
value of c is the acceptance number and is the ceiling in number of
defectives that permits the acceptance of the lot when a sample is
used. The sample size corresponding to the value of AOQL and c is
found in the Table of Sample Sizes (Table II) and for AOQL = 1% and
c = 1 the sample size is 84. The action that follows is to sample 84 from
a lot of 1000 pieces and if one or less defective is found accept the lot
without further inspection and if more than one defective is found reject
the lot for complete sorting. The results to be expected are (1) there is
an absolute guarantee that over a series of lots the average per cent
defective will not exceed the selected value of AOQL, and (2) unless the
variation of incoming quality from lot to lot is very small,² the AOQL
will be maintained with an amount of inspection that is of practical
significance in approximating the minimum inspection which could be
obtained if incoming quality from lot to lot were known.

* Acknowledgment is made to Mr. P. M. Brink of Minneapolis-Honeywell Regulator Co. for
indispensable assistance with the calculations and for his excellent work in drawing graphs.

¹ If the point of intersection falls on a line, use the zone directly below.

² The word “small” is used here in the sense of being somewhat less than the ±3σ_p limits of
variation.




The limitation of Chart I is that it assumes sample size is small
relative to the lot size. If this is not the case, the sample size is larger than
need be. However, the limitation is not of a serious nature as the
difference between sample size with or without the assumption that n is
small relative to N is not of a large order unless the lot size is very small.

These plans are designed for use whenever there is a desire to
maintain an average quality over a series of lots. Thus, they may be used
advantageously for inspections between operations, departments,
subassemblies, receiving inspection, finished products, etc.


THE BACKGROUND OF CHART I³

The Derivation of Combinations of Sample Size and Acceptance Number
Yielding a Selected Value of AOQL.

The formula for average outgoing quality in terms of the
hypergeometric is⁴



where m is the number of defectives in a sample of size n, c is the
acceptance number and p is the per cent defective of a lot of size N.
Assume that p is less than or equal to ten per cent and that sample size


³ The following will serve as a useful list of references: Dodge, H. F. and Romig, H. G., Sampling
Inspection Tables, John Wiley & Sons, Inc., New York, 1945. Freeman, H. A., Friedman, M., Mosteller,
F., and Wallis, W. A., Sampling Inspection, McGraw-Hill Book Co., Inc., New York, 1948. Grant,
E. L., Statistical Quality Control, McGraw-Hill Book Co., Inc., New York, 1946. Hoel, P. G.,
Introduction to Mathematical Statistics, John Wiley & Sons, Inc., New York, 1946. Peach, Paul, An Introduction
to Industrial Statistics and Quality Control, Edwards and Broughton Co., Raleigh, N. C., 1945. Wilks,
S. S., Mathematical Statistics, Princeton University Press, Princeton, N. J., 1947. Working, Holbrook,
A Guide to the Utilization of the Binomial and Poisson Distributions in Industrial Quality Control,
Stanford University Press, Stanford University, California, 1943.

Churchman, C. W. and Epstein, B., Tests of Increased Severity, Journal of the American Statistical
Association, Vol. 41, No. 236, December, 1946, pp. 567-590. Dodge, H. F., A Sampling Inspection Plan
for Continuous Production, The Annals of Mathematical Statistics, Vol. XIV, No. 3, September, 1943,
pp. 264-279. Wald, A. and Wolfowitz, Sampling Inspection Plans for Continuous Production Which
Insure a Prescribed Limit on the Outgoing Quality, The Annals of Mathematical Statistics, Vol. XVI,
No. 1, March 1945, pp. 30-49.

Army Service Forces, Office of the Quartermaster General, Sampling for Quality Control
(Supervisor's Edition), December, 1945. Navy Department, General Specification for Inspection of Material,
Appendix X, Standard Sampling Inspection Tables for Inspection by Attributes, April 1946, United
States Government Printing Office, Washington, D. C.

⁴ Wilks, S. S., op. cit., p. 223 and Hoel, P. G., op. cit., p. 224.






is greater than ten. Then a close approximation is established by the
substitution of the Poisson for the hypergeometric distribution.
Further assume that N is large relative to n and therefore also to m so that
n/N and m/N are considered negligible. Given these assumptions,
equation (1) reduces to⁵

(2) $AOQ = p\sum_{m=0}^{c}\dfrac{e^{-np}(np)^m}{m!}$



(3) 

(4) 


where p is the abscissa value of maximization. Let $a_c = (n)(AOQL)$,
which values⁶ for integral variations of c from zero to twelve are
presented in Table I. Sample sizes which in combination with c yield
selected values of AOQL are readily determined by dividing the
(n)(AOQL) values by the given values of AOQL. Table II presents
these sample sizes.
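The (n)(AOQL) values can be reproduced, at least approximately, by maximizing $np\,P_a$ over $np$ under the Poisson form of equation (2). The sketch below is one way of doing so and is not the authors' procedure, which relied on published Poisson summation tables; the ternary search, its bounds and the function names are assumptions of this example.

```python
import math


def poisson_cdf(c, mean):
    """Probability of c or fewer defectives when the expected number is `mean`."""
    term = math.exp(-mean)
    total = term
    for m in range(1, c + 1):
        term *= mean / m
        total += term
    return total


def n_times_aoq(c, np_value):
    """n * AOQ from equation (2), written as a function of np."""
    return np_value * poisson_cdf(c, np_value)


def maximise(c, lo=0.0, hi=30.0, iterations=200):
    """Ternary search for the np that maximizes n * AOQ (single peak assumed on [lo, hi])."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if n_times_aoq(c, m1) < n_times_aoq(c, m2):
            lo = m1
        else:
            hi = m2
    np_star = (lo + hi) / 2.0
    return np_star, poisson_cdf(c, np_star), n_times_aoq(c, np_star)


for c in range(0, 5):
    np_star, p_accept, a_c = maximise(c)
    print(c, round(np_star, 3), round(p_accept, 6), round(a_c, 4))
```

Dividing a value of $a_c$ obtained this way by the chosen AOQL (as a fraction) gives the corresponding sample size before rounding.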


⁵ Equation (2) differs from Dodge and Romig's equation of AOQ, which is

$AOQ = \sum_{M=0}^{N}\binom{N}{M}p^{M}(1-p)^{N-M}\sum_{m=0}^{c}\dfrac{(M-m)}{N}\cdot\dfrac{\binom{M}{m}\binom{N-M}{n-m}}{\binom{N}{n}}$

In words, this equation states that AOQ values, as calculated by the hypergeometric formula of the
acceptance from a sample size n of c defectives pertaining to a lot of size N and M defectives, are weighted
by the expected binomial frequencies of defectives, lot size N and average incoming quality equal to
p. Thus their assumption is that a lot is a sample from a stream of statistically controlled product
varying according to the binomial distribution. The equation can easily be reduced to the summary form of

$AOQ = p(1 - n/N)\sum_{m=0}^{c}\dfrac{n!}{(n-m)!\,m!}\,p^{m}(1-p)^{n-m}$

and by substituting the Poisson for the Binomial the equation given for AOQ in their book is obtained
(op. cit., p. 48, equation 15). It is to be noted that given our assumption that n/N is small our definition
does not differ from theirs and also that with this assumption AOQ is made independent of the binomial
form of distribution and of N.

The writers wish to thank Mr. Dodge and Mr. Romig for their kindness in conveying to us by way
of correspondence the underlying aspects of their definition of AOQ.

⁶ Poisson summation tables of Grant or Molina may be used to calculate n(AOQL) values. Grant,
E. L., op. cit., Table G, pp. 542-546. Molina, E. C., Poisson's Exponential Binomial Limit, D. Van
Nostrand, New York, 1947, Table II.






TABLE I

VALUES OF n(AOQL) = a_c

  c      np*       P_a       n(AOQL)**
  0     1.000    .367879      .3679
  1     1.618    .519136      .8400
  2     2.270    .604010     1.3711
  3     2.945    .659552     1.9424
  4     3.640    .698775     2.5435
  5     4.349    .728499     3.1682
  6     5.071    .751730     3.8120
  7     5.804    .770495     4.4720
  8     6.546    .786079     5.1457
  9     7.297    .799148     5.8314
 10     8.055    .810388     6.5277
 12     9.590    .828740     7.9476

* Accurate to .0005.

** Accurate to .00005.


TABLE II

TABLE OF SAMPLE SIZES
AOQL, Per cent

  c   .10   .25   .50   .75   1.0   1.5   2.0   2.5   3.0   3.5   4.0   4.5   5.0     6     7     8     9
  0   367   147     ?    49    36    24    18    14    12    10     9     8     7     6     5     4     4
  1   840   336     ?   112    84    56    42    33    27    24    21    18    16    13    11    10     9
  2  1371   548     ?   182   137    91    68    54    45    39    34     ?    27    22    19    17    15
  3  1942   776   388   258   194   129    97    77    64     ?    48    43    38    32    27    24    21
  4  2543  1017   508   339   254   169   127   101    84    72    63    56    50    42    36    31    28
  5        1267   633   422   316   211   158   126   105    90    79    70    63    52    45    39    35
  6               762   508   381   254   190   152   127     ?    95    84    76    63    54    47    42
  7                     596   447   298   223   178   149   127   111    99    89    74    63    55    49
  8                           514   343   257   205   171   147   128   114   102    85    73    64    57
  9                                 388   291   233   194   166   145   129   116    97    83    72    64
 10                                             261   217   186   163   145     ?   108    93    81    72
 12                                                               198   176   158   132   113    99    88


c-p and c-N Zones of Minimum Average Inspection and the Construction
of a Minimum Inspection Single Sampling Chart

The formula for average number of pieces inspected per lot (I) is⁷

(5) $I = n_c + (N - n_c)\left(1 - \sum_{m=0}^{c}\dfrac{e^{-pn_c}(pn_c)^m}{m!}\right)$

⁷ The subscripts of n are c and AOQL; the omission of AOQL is a matter of convenience in notation.


CHART I. AOQL SINGLE SAMPLING PLANS. (Horizontal axis: LOT SIZE.)


Assume that N and AOQL are constant.⁸ Equation (5) then algebraically
characterizes curves illustrated by those in Chart II for the value
of c from zero to four, given AOQL = 2% and N = 1000.⁹ This chart is
significant in three respects. First, it shows that as p increases
inspection curves intersect and form zones of minimum average inspection
for certain ranges of p. These are the popular zones introduced
by H. F. Dodge and H. G. Romig.¹⁰ Second, it is instructive in that as
p varies from zero to one, the c factor giving minimum inspection varies
parabolically with a maximum of three. That is, as p varies from zero
to one, the variation of c forming minimum inspection zones is 0, 1, 2,
3, 2, 1, and 0. This signifies that only sampling plans with these values
of c yield minimum inspection. Sampling plans using c equal to or
greater than four do not involve minimum inspection with any value
of p. Third, it points out that the ratio p/AOQL = 1 is contained within
the zone formed by the maximum of the c values yielding minimum
inspection. Chart III presents only the segments of the inspection
curves of Chart II which form zones of minimum average inspection.
This curve of Chart III is designated as the c-p minimum inspection
curve. If it is assumed that p and AOQL are constant, then equation
(5) represents inspection curves forming c-N zones of minimum
inspection which are illustrated in Chart IV. In this case, the designation of
c-N is attached to the minimum inspection curve.¹¹
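The inspection curves described above can be evaluated directly from equation (5). The following sketch uses the Table I values of n(AOQL) for c = 0 to 4, takes the sample size as $a_c$/AOQL without rounding (a simplification assumed here), and picks the c giving the least average inspection at a stated p. The function names are hypothetical.

```python
import math

# n(AOQL) values for c = 0..4, taken from Table I.
A_TABLE = {0: 0.3679, 1: 0.8400, 2: 1.3711, 3: 1.9424, 4: 2.5435}


def poisson_cdf(c, mean):
    """Probability of c or fewer defectives when the expected number is `mean`."""
    term = math.exp(-mean)
    total = term
    for m in range(1, c + 1):
        term *= mean / m
        total += term
    return total


def average_inspection(c, p, lot_size, aoql):
    """Equation (5): sample size plus the expected number of pieces sorted after rejection."""
    n_c = A_TABLE[c] / aoql            # unrounded sample size (assumption of this sketch)
    accept = poisson_cdf(c, p * n_c)
    return n_c + (lot_size - n_c) * (1.0 - accept)


def minimum_inspection_c(p, lot_size, aoql):
    """Acceptance number among c = 0..4 with the smallest average inspection."""
    return min(A_TABLE, key=lambda c: average_inspection(c, p, lot_size, aoql))


# Conditions of Chart II: AOQL = 2%, N = 1000.
for p in (0.001, 0.005, 0.01, 0.02, 0.03, 0.04, 0.05):
    best = minimum_inspection_c(p, 1000, 0.02)
    print(p, best, round(average_inspection(best, p, 1000, 0.02), 1))
```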

The equation

(6) $n_c + (N - n_c)\left(1 - \sum_{m=0}^{c}\dfrac{e^{-pn_c}(pn_c)^m}{m!}\right) = n_{c+1} + (N - n_{c+1})\left(1 - \sum_{m=0}^{c+1}\dfrac{e^{-pn_{c+1}}(pn_{c+1})^m}{m!}\right)$

gives values of p and N demarcating the boundaries of c-p and c-N
zones of minimum inspection respectively. The following equation,
derived from equation (6)


⁸ Fixing the value of AOQL determines the values of n_c.

⁹ When only c is specified, the corresponding value of n_c is to be understood.

¹⁰ Op. cit.

¹¹ c-AOQL zones also may be derived by holding constant p and N. However, these zones have
only theoretical value and are not discussed here. The zones of minimum average inspection may be
succinctly analysed in terms of differences in sample sizes and amount of detailing as expressed by the
equation


fjb-c — Ic « (nt-e - ne) - [(V - nc)Pr/-“ (N - ni^)Pr]^] c+1, c+2*** 2c 





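(7) $pN = \dfrac{p}{AOQL}\cdot\dfrac{a_{c+1}\sum_{m=0}^{c+1}\dfrac{e^{-a_{c+1}p/AOQL}(a_{c+1}p/AOQL)^m}{m!} - a_c\sum_{m=0}^{c}\dfrac{e^{-a_c p/AOQL}(a_c p/AOQL)^m}{m!}}{\sum_{m=0}^{c+1}\dfrac{e^{-a_{c+1}p/AOQL}(a_{c+1}p/AOQL)^m}{m!} - \sum_{m=0}^{c}\dfrac{e^{-a_c p/AOQL}(a_c p/AOQL)^m}{m!}}$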

gives an alternate and easier method of determining the c values
yielding minimum inspection. Plotting Np against p/AOQL (Chart
V), minimum inspection zones are described for Np and/or p/AOQL.
Therefore, if N and AOQL are constant, the c number of the c-p zone
of minimum inspection is read directly from Chart V. For instance,
if AOQL = 2%, N = 1000, and p = 1%, then Np = 10 and p/AOQL = 0.5
and the coordinate falls in the zone of c = 2, which result is the same
as that of Chart III (or II). If p and AOQL are given, it follows that
c-N zones of minimum inspection are obtained, so that if p and AOQL
are each equal to 2% and N = 400, then Np = 8, p/AOQL = 1 and the
chart reads c = 1. The same result is found in Chart IV. Therefore,
Chart V presents a minimum inspection AOQL single sampling chart
for known values of p and N.
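The zone boundaries quoted in these examples can be checked from the relation between N and the two acceptance probabilities that follows from equation (6). The sketch below is written under the assumption that the boundary between zones c and c+1 is the lot size at which the two plans give equal average inspection; the function names are hypothetical and the $a_c$ values are those of Table I.

```python
import math

# n(AOQL) values for c = 0..4, taken from Table I.
A_TABLE = {0: 0.3679, 1: 0.8400, 2: 1.3711, 3: 1.9424, 4: 2.5435}


def poisson_cdf(c, mean):
    """Probability of c or fewer defectives when the expected number is `mean`."""
    term = math.exp(-mean)
    total = term
    for m in range(1, c + 1):
        term *= mean / m
        total += term
    return total


def average_inspection(c, p, lot_size, aoql):
    """Equation (5) with the unrounded sample size a_c / AOQL."""
    n_c = A_TABLE[c] / aoql
    return n_c + (lot_size - n_c) * (1.0 - poisson_cdf(c, p * n_c))


def boundary_lot_size(c, p, aoql):
    """Lot size at which plans c and c+1 have equal average inspection."""
    ratio = p / aoql
    a_lo, a_hi = A_TABLE[c], A_TABLE[c + 1]
    acc_lo = poisson_cdf(c, a_lo * ratio)
    acc_hi = poisson_cdf(c + 1, a_hi * ratio)
    return (a_hi * acc_hi - a_lo * acc_lo) / (aoql * (acc_hi - acc_lo))


# Boundaries around the two worked examples in the text.
print(boundary_lot_size(1, 0.01, 0.02), boundary_lot_size(2, 0.01, 0.02))
print(boundary_lot_size(0, 0.02, 0.02), boundary_lot_size(1, 0.02, 0.02))
```

With AOQL = 2% and p = 1%, a lot of 1000 falls between the two printed boundaries, which corresponds to the zone c = 2 read from the chart in the text.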

Chart V is of further interest in that it summarizes the characteristics
of c-p and c-N zones of minimum inspection. In Chart VI the zones
through which the dashed curves pass are the p/AOQL zones of
minimum inspection for the designated values of N and AOQL and these
zones are in direct proportion to c-p zones. It is readily seen, therefore,
that c-p zones vary parabolically and that p/AOQL = 1 is always
contained in the zone of the maximum c of minimum inspection.
Furthermore, it is noted that the number of c zones and the maximum c vary
directly with N. The zones through which the vertical lines of p/AOQL
pass are in direct proportion to c-N zones. If p/AOQL is equal to or less
than one all zones in Chart V (except that forming the boundary
between c = 1 and c = 0) converge to a point at Np = infinity and p/AOQL
= 0. Thus, in this region as N approaches infinity, c-N zones also
approach infinity. If p/AOQL is greater than one, all zones become
vertically asymptotic for definite ranges of p/AOQL values so that as N
increases c-N reaches a definite maximum value. For example, if p = 1.8%,
AOQL = 1% then every value of N higher than 724 will have c = 1 for
minimum inspection since the zone boundaries are asymptotic at values
of p/AOQL equal to approximately 1.79 and 2.24+.

(Chart axis label: AVERAGE NUMBER OF PIECES INSPECTED PER LOT.)

CHART IV. EXAMPLE OF c-N ZONES.

CHART V. AOQL MINIMUM INSPECTION SINGLE SAMPLING CHART. (Axes: Np, AVERAGE NUMBER OF DEFECTIVES PER LOT; p/AOQL.)

AOQL Single Sampling Plans 

The selection of c yielding minimum inspection depends on a
knowledge of p and therefore Np and p/AOQL. In practice these values are
always unknown. However, it is known from Chart V that any
arbitrary selection of a value of p/AOQL will give a value of c that forms a
zone of minimum inspection. It is also known that the maximum c
number of minimum inspection can be determined from N if the value
of p/AOQL is assumed to be 1. This value of c will lead to minimum
inspection for a certain range of p and deviate from minimum for
other values of p. Thus, if this value of c is used as a substitute for those
based on known values of p, inspection over the entire range of p would
approximate minimum. This is shown in Chart VII, which compares
the c = 3 inspection curve (c = 3 obtained by assuming p/AOQL = 1
and N = 1,000) and the c-p minimum curve of AOQL = 2% and N =
1000.¹² Similarly, it is known that the use of the ratio p/AOQL equal
to zero gives the smallest c of minimum inspection, namely zero, and
that minimum inspection is always obtained for values of p beyond
2.24·AOQL.¹³ If this value of c is used as a substitute for those based on
known p, a second approximation to minimum inspection is obtained
for the overall variability of p. Chart VII gives a visual presentation of
the approximation when c = 0 is used. However, because of the rapid
convergence of inspection curves to the value of N as p varies beyond
2.24·AOQL, or stated differently, because of the large amount of
detailing for any value of c beyond 2.24·AOQL (never less than 66%),
it is of little significance from the economic point of view whether the
minimum c = 0 or some other value is used. The importance of
approximation lies in the region of p less than 2.24·AOQL. In studying
inspection curves of approximation within this region, it has been found
that the ratio p/AOQL = 0.5 gives a better approximation to minimum
inspection than any other value of p/AOQL.¹⁴ For the entire range of p,
with practical significance, the ratio of p/AOQL = 0.5 leads to the
selection of an inspection curve which best approximates the c-p
minimum inspection curve.


¹² After the ratio p/AOQL has been assigned, the given value of AOQL (e.g. 2 per cent) is used only
to determine the sample sizes.

¹³ See Chart V.

¹⁴ To obtain the same inspection curves of approximation given by p/AOQL = 0.5, p/AOQL must
necessarily vary in the region of p/AOQL > 1 because of the asymptotic nature of the c zones in this
region.





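The selection of c-values based on the assumption of p/AOQL = 0.5
is made easier by constructing Chart I. This chart eliminates the
requirement of Chart II of calculating Np. Since from equation (7)

(8) $N = \dfrac{1}{AOQL}\cdot\dfrac{a_{c+1}\sum_{m=0}^{c+1}\dfrac{e^{-a_{c+1}p/AOQL}(a_{c+1}p/AOQL)^m}{m!} - a_c\sum_{m=0}^{c}\dfrac{e^{-a_c p/AOQL}(a_c p/AOQL)^m}{m!}}{\sum_{m=0}^{c+1}\dfrac{e^{-a_{c+1}p/AOQL}(a_{c+1}p/AOQL)^m}{m!} - \sum_{m=0}^{c}\dfrac{e^{-a_c p/AOQL}(a_c p/AOQL)^m}{m!}}$

and since N, p = AOQL/2, c, $(n_c)(AOQL) = a_c$, and $(n_{c+1})(AOQL) = a_{c+1}$
are given, we have that

(9) $N = \dfrac{A}{AOQL}$

(10) $\log N = \log A - \log AOQL$

where

(11) $A = \dfrac{a_{c+1}\sum_{m=0}^{c+1}\dfrac{e^{-\frac{1}{2}a_{c+1}}(\frac{1}{2}a_{c+1})^m}{m!} - a_c\sum_{m=0}^{c}\dfrac{e^{-\frac{1}{2}a_c}(\frac{1}{2}a_c)^m}{m!}}{\sum_{m=0}^{c+1}\dfrac{e^{-\frac{1}{2}a_{c+1}}(\frac{1}{2}a_{c+1})^m}{m!} - \sum_{m=0}^{c}\dfrac{e^{-\frac{1}{2}a_c}(\frac{1}{2}a_c)^m}{m!}}$

and is a constant. Thus, in logarithms the boundary lines between lot
sizes and AOQL are parallel, straight and negatively sloped.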
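Since A in (11) depends only on c, the zone boundaries of Chart I can be tabulated once and then compared with the product N·AOQL, as (9) suggests. The sketch below does this with the Table I values of $a_c$ for c = 0 to 10; the function names, and the convention that a point lying exactly on a boundary is assigned the lower c, are assumptions of this example rather than statements about the printed chart.

```python
import math

# n(AOQL) values for c = 0..10, taken from Table I.
A_TABLE = {0: 0.3679, 1: 0.8400, 2: 1.3711, 3: 1.9424, 4: 2.5435, 5: 3.1682,
           6: 3.8120, 7: 4.4720, 8: 5.1457, 9: 5.8314, 10: 6.5277}


def poisson_cdf(c, mean):
    """Probability of c or fewer defectives when the expected number is `mean`."""
    term = math.exp(-mean)
    total = term
    for m in range(1, c + 1):
        term *= mean / m
        total += term
    return total


def chart_constant(c):
    """Constant A of (11) for the boundary between zones c and c+1 (p/AOQL = 1/2)."""
    acc_lo = poisson_cdf(c, A_TABLE[c] / 2.0)
    acc_hi = poisson_cdf(c + 1, A_TABLE[c + 1] / 2.0)
    return (A_TABLE[c + 1] * acc_hi - A_TABLE[c] * acc_lo) / (acc_hi - acc_lo)


def acceptance_number(lot_size, aoql):
    """Zone of Chart I containing the point (lot size, AOQL), using N * AOQL versus A."""
    product = lot_size * aoql
    c = 0
    while (c + 1) in A_TABLE and chart_constant(c) < product:
        c += 1
    return c


for c in range(0, 4):
    print(c, round(chart_constant(c), 2))
print(acceptance_number(1000, 0.01))
```

For the example of the opening section (AOQL = 1%, N = 1000) the zone found this way can then be looked up in Table II to obtain the sample size.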

Comparison of Charts I and V Given Knowledge of Average Incoming
Quality (p)

Chart V yields minimum inspection if the value of p for an incoming
lot is known. Therefore, Chart V is useful in practice if p is known and
if the variation of p is almost completely within a single zone of
minimum inspection. In general, this will require a variation considerably
less than the normal plus or minus three sigma limits of variability.
Consequently, unless the variation of p is very small, Chart I gives a
better approximation to minimum inspection because of the parabolic
behavior of the c-p zones.




CHART VI. CHARACTERISTICS OF c-p AND c-N MINIMUM
INSPECTION ZONES.




ON MEASURING LANGUAGES 


Stuart C. Dodd, Ph.D.

University of Washington, Seattle 

This paper proposes ten criteria by which the suitability of 
any language for use as an international language might be 
measured. These criteria fall into two classes. The first three
are criteria of familiarity; that is, they measure the extent to
which a candidate language is already familiar to the people
who would have to learn it. The remaining seven are criteria
of excellence, and are intended to rate languages according to
such properties as their freedom from local idioms, from
exceptions to the rules of grammar, from inflections, and so on.
Such criteria have three purposes. First, they would rank the 
candidate languages by familiarity and excellence. Second, 
they would diagnose weaknesses in each candidate; from this 
diagnosis a living language could be simplified towards the 
ideal regularity of an artificial language, while preserving 
more of familiarity to the world’s population than an artificial 
language possesses. Finally, they would indicate any progress 
that the world may be making from decade to decade towards 
achieving a single language. 


The problem of an international auxiliary language has become in
part a problem of selecting it from among the three hundred
candidates which have been proposed in the last seventy years. To select the
best candidate requires prior agreement on what is “best.” What are
the criteria which specify the “best”? This paper proposes ten criteria.
It further proposes ten indices which measure the degree to which each 
criterion is satisfied by a given candidate language. A weighted sum of 
these indices can then rank the candidates into a relative order. 

The criteria for the best world language may be put into two classes,
the practical and the ideal. These are also called the natural vs the
schematic types when applied to artificial languages. They specify, 
respectively, what is most likely to be adopted by the world and what is 
intrinsically the most excellent as a language. These will be referred to 
here as the criteria of familiarity and the criteria of excellence. 

For it should be obvious that the most practical proposal is one which 
involves the least change or the least amount of new learning for the
world's population. Thus the candidate language which has the largest 
proportion of elements which are already familiar to the maximum 
number of people will encounter least resistance. We can measure the 
degree of familiarity of a candidate language and thus make a crucial 
comparison to prove which candidates are most practical. 




THE FAMILIARITY CRITERION

What language is most familiar to the users of any one language in 
that it has the largest percentage of its words, grammatical forms or 
other elements the same between their language and the candidate 
world language? To measure this, several indices, varying in
completeness, are proposed. A first index of familiarity, which may be
labelled F1, is calculated by taking as a first step a representative
sample of the elements of the candidate language. One such sample might
be the 1000 semantic words which occur most frequently, as determined
from a semantic word count. (A “semantic” word is defined as a
word or phrase with a unit meaning, e.g. “to look out for” meaning
“to protect.”)

The next step is to assign to each of these thousand “most frequent”
words, which serve as a representative sample of the candidate language,
a value of 1 if it is exactly the same in the national language and
a value of ½ if it is partly the same (as in having a root or an affix in
common). These unit or half unit values are added up and, since they
will give a thousand points at maximum, this total divided by the
thousand words will serve as a percentage of common vocabulary between
one candidate language and one national language. This percentage is F1.¹

Next there will be other values of F1, one for each national language
paired with each candidate language. That is, for one candidate language
there will be as many F1’s as there are national languages or important
groups of national languages deserving consideration in the
world. This number of F1 indices will then be multiplied by the number
of candidate languages, i.e. the number of languages for which research
provides these data and which are considered important enough to be
likely candidates for a world language. It is obvious that this is an
immense project of research for many scholars for many years.

The F1 indices of familiarity next must be combined into a net
index of familiarity for each candidate language for the whole world.
That is, the F1 indices for one candidate language must each be weighted,
that is multiplied, by the number of people speaking the national
language corresponding to that index. This gives greater importance to
the familiarity index of a language spoken by 100 million people than
to one spoken by one million people. From this democratic process of
weighting each index by the population to which it applies, there will
result a single net index of familiarity, which we may call F2, for each
candidate language. These indices will rank the different candidates in


¹ $F_1 = \sum V / N$, where $V$ = value of 1 or ½ and $N$ = number of words in sample studied.






order of familiarity to the world. The indices reveal the languages at 
the top of the list which deserve further study and also reveal those at 
the bottom of the list which may be dropped from further considera¬ 
tion. 
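
The population weighting that yields F2 might be sketched as follows. The text says only that each F1 is multiplied by the number of speakers; dividing by the total population so that F2 stays on the same percentage scale is an assumption of this sketch, and the function name and figures are hypothetical.

```python
def net_familiarity(f1_by_language, speakers_by_language):
    """Population-weighted mean of the F1 indices for one candidate.

    f1_by_language: F1 percentage for each national language.
    speakers_by_language: number of speakers of each national language.
    Normalizing by the total number of speakers is an assumption made
    here so that the result remains a percentage.
    """
    total_speakers = sum(speakers_by_language[lang] for lang in f1_by_language)
    if total_speakers <= 0:
        raise ValueError("total number of speakers must be positive")
    weighted_sum = sum(
        f1_by_language[lang] * speakers_by_language[lang]
        for lang in f1_by_language
    )
    return weighted_sum / total_speakers


# Hypothetical figures: language A has 100 million speakers, B has 1 million.
f1 = {"A": 80.0, "B": 20.0}
speakers = {"A": 100_000_000, "B": 1_000_000}
print(net_familiarity(f1, speakers))
```

With these made-up figures the result lies close to the F1 of language A, which is the effect the weighting is meant to produce.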

A number of problems of method will have to be solved in computing
these indices. For example, in determining the number of people speaking
each national language, in order to fix upon a weighting of its F1
index, allowance must be made for bilingual people or that fraction of
a population which may speak more than one language. Such persons
might be counted as ½ for each of the two languages they may speak,
thus giving them a total weighting the same as for any person speaking
but one language. Again, a further refinement in the indices might be
to weight each word in calculating the F1 index in proportion to the
frequency of occurrence of that word. A priori, it seems probable, however,
that if the thousand most frequently used words are taken as the
sample in calculating F1, differences in the frequency of individual
words would not greatly change the relative size of the F1 indices.

A third problem is whether to take the total population speaking a
given national language as a weighting factor in calculating F2 or
whether to take some part of it which is more relevant for international
purposes. Thus the literate part of each national population is probably
a more suitable number to take as weighting coefficient. This index
might be called F3, measuring the degree of familiarity to literate
people. For the literate population represents those who are communicating
in international affairs more adequately than the host of illiterates.
To include the illiterates would give the 400 million or more
illiterates of China or India an importance greater than all the Western
European nations combined. To weight each nation in proportion to
its literate population would probably be a fairer basis, since part of
learning an international language is learning its written forms. An
index of familiarity should apply in part to the people who have already
learned some written form of language and might have to unlearn and
relearn an international language more than to the people who have
learned no written form and to whom learning a new word would be
little more difficult than learning their own national written forms.

THE EXCELLENCE CRITERIA

In analyzing next the excellence of any language for international
communication, the following criteria are proposed as hypotheses.
Some combination of criteria such as these would define what is meant
by “the most excellent language” for international communication.




The eight proposed criteria of excellence are that: its sentences 
should be idiomless and ordered in wording; its words should be uni¬ 
vocal in meaning, flectionless, phonetic in spelling and unique in pro¬ 
nunciation; its letters should be unique in sound and shape. 

A world language should be idiomless. It should not have phrases 
which are local and peculiar to one nation and cannot be literally 
translated into other languages. A world language should have all its 
phrases so logical as to enable literal translation into any national lan¬ 
guage. To measure the freedom from idioms of any language, a list of 
its idioms as found in a frequency count of a representative sample of 
perhaps a million words of prose should be made. The index E1² would
be the ratio of the million words of prose examined divided by that
million plus the number of idioms (including repetitions) found in that
representative body of prose. If many idioms are found, this ratio
would be a small percentage. It would become 100%, indicating a
language entirely free from idioms, only when no idioms are found.
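
A short Python sketch of the idiom index, offered only as an illustration: the function name `idiom_freedom_index` and the idiom count used below are hypothetical, while the ratio itself (words divided by words plus idioms) is the one stated in the text.

```python
def idiom_freedom_index(n_words, n_idioms):
    """Return E1 as a percentage: 100 * N / (N + M).

    n_words: number of words of prose examined (N).
    n_idioms: number of idioms found, repetitions included (M).
    """
    if n_words <= 0 or n_idioms < 0:
        raise ValueError("need n_words > 0 and n_idioms >= 0")
    return 100.0 * n_words / (n_words + n_idioms)


# A sample with no idioms scores 100%.
print(idiom_freedom_index(1_000_000, 0))
# A hypothetical sample with 250,000 idiom occurrences scores lower.
print(idiom_freedom_index(1_000_000, 250_000))
```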

To detect an idiom, three tests are available. The first test is the 
definition of an idiom as a phrase different in meaning from its con¬ 
stituent words. Another test is to try translating each phrase into each 
of some dozen other representative languages and see whether that 
phrase can be translated literally. Another test is to see if each phrase 
can be expressed in the symbols of modern Symbolic Logic. This new
science, grown up in the last half century, develops an algebra for words 
and sentences, so that these qualitative symbols can be handled in 
equations with all the precision of mathematics. 

A world language should have the order of the words in its sentences 
obey rules without exception. The rule of course may be very rigid, or
very flexible as in stating that certain words may occur anywhere in
the sentence depending on the emphasis desired. Ideally, it is possible to
conceive of a language in which all word order is determined by one
rule such as that “modifiers follow that which they modify.” This rule
would mean that a verb followed the subject and that the object of
the verb followed the verb whose meaning it completes. This rule would
mean that adjectives followed the noun they modify, adverbs followed
the verbs they modify, and every phrase or clause followed whatever
it modifies.

² $E_1 = N/(N+M)$, where $N$ = number of words in sample studied; $M$ = number of idioms.

The index, which we may label E2, which measures the excellence of
a sentence in having the order of its words abide by rule could be computed
as a ratio of the number of words in a representative sample of
prose (perhaps one million words) to the number of these words when
each is multiplied by its “frequency rank.” This “frequency rank”
needs explaining. It is determined as follows: For each word in a sentence
the rule that determines its position in the sentence is decided
upon. The frequency of occurrence of each rule must then be determined
in a large sample of prose, and the rules put into a rank order so
that the most used rule will be given a rank of 1, the next most used
rule will be given a rank of 2, etc. Each word is then multiplied by the
rank (whether 1, 2, 3, etc.) of the rule which governs its position in the
sentence. This multiplication by a rank is “weighting” the word according
to the frequency rank (in this case the frequency rank of the rule
governing the position of the word in the sentence). By this index,³
if there is but one rule for all words the weighting factor is 1 and the
index will be 100%, as it will be a million words divided by a million
words. If, however, a second rule appears then some of the words in the
denominator of the index will be multiplied by two and the index will
be less than 100%. If a third rule appears the index will become still
smaller in proportion to the number of words covered by that third
rule. Thus this index becomes smaller in proportion as the number of
rules becomes greater.
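
The rank weighting described above can be sketched in Python. This is a hedged illustration rather than the author's procedure: the function name `rank_weighted_index` and the rule labels are hypothetical, and ties in frequency are broken by order of first appearance, a choice the text does not address.

```python
from collections import Counter


def rank_weighted_index(rule_labels):
    """Return 100 * N / sum(rank), given one word-order rule per word.

    rule_labels: a list with one entry per word of the sample, naming
    the rule that governs that word's position in its sentence.
    """
    if not rule_labels:
        raise ValueError("the sample must contain at least one word")
    counts = Counter(rule_labels)
    # most_common() lists rules from most to least frequent, so the
    # position in that list (starting at 1) is the frequency rank.
    rank = {
        rule: position
        for position, (rule, _) in enumerate(counts.most_common(), start=1)
    }
    denominator = sum(rank[rule] for rule in rule_labels)
    return 100.0 * len(rule_labels) / denominator


# One rule for every word gives 100%.
print(rank_weighted_index(["modifier-follows"] * 5))
# A second, rarer rule lowers the index.
print(rank_weighted_index(["modifier-follows"] * 3 + ["free-position"]))
```

In the second call three words carry rank 1 and one word carries rank 2, so the denominator is 5 and the index is four fifths.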

A high index, therefore, measures simplicity of language in this respect
and a low index measures its complexity or irregularity. The index
also is proportional to the frequency with which each rule occurs in
the representative sample that is studied. It should be obvious that
having a definite word order makes sentences which are clear and
unambiguous in meaning. If the order of words in a sentence always
follows some rule, there is little possibility of different people interpreting
the sentence in different ways. Thus a rule-abiding order of
words is an objective way of measuring and controlling the degree of
ambiguity in the sentences of a language. This is especially so in a
language whose words are not inflected (as explained below).

A world language should have words which are uninflected. This criterion
means that no word should ever change its form to express
a grammatical inflection such as masculine or feminine gender, person,
or number, tense, voice or mood of a verb or degrees of an adjective.
This is the trend of evolution of language. Languages grow
up with these grammatical inflections in primitive thinking as when


³ $E_2 = N/\sum R_2$, where $N$ = number of words in sample studied; $R_2$ = frequency rank of rule governing the position of each word.





man ascribed masculine and feminine gender to all nouns, simply
because man thought of his own difference in sex as existing in everything
else around him. But as people developed towards greater maturity
and flexibility in language they dropped these grammatical
inflections. Some of them are entirely unnecessary. Others are expressed
in uninflected “particles” such as the prepositions and conjunctions
and adverbs like “to,” “as,” “and,” “or,” “not,” etc. Chinese
has gone furthest in developing a completely uninflected language of
root words which can be flexibly combined in different orders to make a
great variety of meanings.

The flexibility of uninflected words can be compared to the flexibility
of the alphabet where the clumsy symbols for whole syllables were
replaced by a few letter symbols for elemental sounds. These letters
can be flexibly combined to make any word in any of the world’s languages.
Somewhat similarly, root words and particles yield more flexible
sentences with a greater range of possible meanings than inflected
words can do.

To measure the degree to which a language has progressed towards
the ideal of complete absence of inflections, an index, which may be
called E3, may be calculated from the same representative body of
prose of perhaps a million words which may be used for calculating
most of these indices discussed in this paper. The formula for the index
of freedom from inflections is the ratio of one million words divided by
those million words each weighted by its “frequency rank of inflection.”
This “frequency rank of inflection” is determined in a way similar
to the frequency rank of rules in the preceding index, E2. To get it,
the number of times each inflection occurs in the million words is
counted, and the inflections with one grammatical meaning are given
the ranks of 1, 2, 3, etc., in order of frequency. Each word is then multiplied
by 1 if it is uninflected, by 2 if its first inflection is the most frequently
occurring one, by 3 if its first inflection is the next most frequently occurring
one, by a weight of 4 if its first inflection is the next most frequently
occurring one, etc. If the word has more than one inflection,
it will be multiplied by more than one such rank. By this index,⁴ a
language will be 100 per cent flectionless only when it uses root words
and particles only. It will be less than 100% flectionless in proportion
as:

a. It has many words which are inflected 


⁴ $E_3 = N/\sum R_3$, where $N$ = number of words in sample studied, and $R_3$ = frequency rank of inflection of each word.






b. The inflected words are frequent in occurrence, and 

c. There is more than one inflection to express one grammatical 
meaning. 

Thus a language which has four conjugations for its verbs instead of 
one conjugation will have a larger weighting in the denominator of the 
index and, therefore, a lower index of excellence in respect to being 
flectionless. 

A world language should be phonetic in spelling. This criterion of excellence,
that every word should be spelled exactly as it is pronounced,
implies the criterion mentioned below that every letter should represent
only one sound. When the words of a language are spelled as they
are pronounced, learning to read that language becomes very simple.
If there is much literature and reading matter in one’s environment,
a child will learn to read without schooling as automatically
as he learns to speak by merely being surrounded by people using the
written language and by his wanting to know what others are writing
and to write things himself. A phonetic spelling is perhaps the greatest
aid to make the population 100% literate. All languages which use
letters were phonetically spelled at one time of course, but in the case
of many languages the spelling of a previous century has become standardized
while the pronunciation has changed. Another source of unphonetic
spelling, however, is that there are more sounds in a language
than letters, so that some letters will be used to mean more than one
sound. Thus English uses 40 sounds, but has only 26 letters in its
alphabet with a result that its irregularity of spelling is greatly increased.

To measure the degree to which a language is phonetic in spelling
an index of this criterion of excellence, which we may call E4, may be
defined by a ratio calculated from a large sample of perhaps 100 thousand
letters as they occur in the representative sample of prose referred
to above. The index might be 100 thousand letters divided by
the number of those letters when each one is multiplied by its “frequency
rank of pronunciation.” This frequency rank of pronunciation
is again calculated similarly to the frequency rank of rules in E2 or
frequency rank of inflections in E3. To calculate it the frequency with
which each pronunciation of each letter recurs must be counted. Then,
for any one letter, its most frequent pronunciation is given a rank of 1.
Its next most frequent pronunciation is given a rank of 2 and so on.
Each letter in the denominator of the ratio is multiplied by its rank and
these products are added to make the denominator of E4. By this index,




a language will be 100% phonetic only when each letter has one pronunciation
and when every word is spelled in a single and phonetic way.
The phonetic index E4⁵ of a language will decrease in proportion as
its letters, as they occur in words with current spelling, have more than
one pronunciation.
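
Unlike the word-order index, the phonetic index ranks pronunciations separately for each letter. A minimal Python sketch of that per-letter ranking follows; the function name `per_item_rank_index`, the pair representation, and the toy transcription are all hypothetical, and ties are broken by first appearance.

```python
from collections import Counter, defaultdict


def per_item_rank_index(observations):
    """Return 100 * N / sum(rank) with ranks computed within each item.

    observations: one (item, variant) pair per occurrence, for example
    (letter, pronunciation) for every letter of the sample.
    """
    observations = [tuple(pair) for pair in observations]
    if not observations:
        raise ValueError("the sample must contain at least one occurrence")
    counts = defaultdict(Counter)
    for item, variant in observations:
        counts[item][variant] += 1
    # Within each item the most frequent variant gets rank 1, the next 2, ...
    rank = {}
    for item, variant_counts in counts.items():
        for position, (variant, _) in enumerate(
            variant_counts.most_common(), start=1
        ):
            rank[(item, variant)] = position
    denominator = sum(rank[pair] for pair in observations)
    return 100.0 * len(observations) / denominator


# Toy sample: "c" is sounded as k three times and as s once; "a" never varies.
sample = [("c", "k")] * 3 + [("c", "s")] + [("a", "ah")] * 4
print(per_item_rank_index(sample))
```

The same function would serve for any index in which ranks are assigned within each word or letter, such as ranking the meanings of each word or the shapes of each letter.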

A world language should have words which are univocal in meaning. This 
criterion of excellence means that every word should ideally have only 
one meaning and every meaning should have only one word to symbolize
it. There should be no words with multiple meanings nor should
there be any synonyms which mean exactly the same. (Synonyms with
slightly different meaning are desirable to express shades of differences
in meaning and to make a language rich, but words between which
no differences in meaning can be detected are merely confusing.) This
is a fundamental principle of symbolism: that each symbol should
represent one and only one “referent” or meaning. Obviously, our living 
languages as they have grown up in folk usage have acquired multiple 
meanings for many of their words. Only artificial languages such as 
Esperanto approach the ideal of “one word, one meaning” as they can 
start out afresh by assigning a word or phrase for every meaning listed 
in the dictionary. 

To measure the excellence of language in respect to its words being 
unique in meaning, an index, E5, may be defined as a ratio calculated 
from the same representative sample of a million words of prose which 
has been used previously. This index is one million words divided by 
the number of those words when each is multiplied by its “frequency 
rank of meaning.” This frequency rank of meaning is similar to previous 
frequency ranks. It would require a semantic word count, i.e. a count 
of the frequency of occurrence of each meaning (as listed in the dictionary
for each word) in the million word sample of prose. (See Eaton’s
Semantic Frequency List for English, French, German and Spanish.)
Each meaning of each word will be given a rank of 1 if it occurred
most frequently, of 2 if it occurred next most frequently and so on.
Each word would be multiplied by this frequency rank and all these
products would be added up to get the denominator of E5.⁶ Since no
complete semantic word counts have been made as yet in the world to
the author’s knowledge (although a scientific committee is at work on
this in the United States), a similar index of uniqueness of meaning of


⁵ $E_4 = N/\sum R_4$, where $N$ = number of letters in sample studied; $R_4$ = frequency rank of pronunciations of each letter.

⁶ $E_5 = N/\sum R_5$, where $N$ = number of words in sample; $R_5$ = frequency rank of meaning of each word.





words may be Em. Em might be the number of words in the most complete
dictionary of a language divided by the number of meanings
listed in that dictionary. This variant index is easily computed but has
the disadvantages of giving great weight to unusual and archaic meanings
with which a language may be burdened and of ignoring the important
factor of the frequency of the use of a word with multiple meanings.

The world language should have uniform pronunciation everywhere.
This sixth criterion of excellence means that there should be no difference
in different countries in the way any word of the world language is
pronounced. By means of the international standardized phonetic
alphabet the standard pronunciation of every word can be fixed.
Phonograph records and radio recordings can also fix the pronunciation.
Some people may ask whether, since languages have changed
their pronunciation in the past, the new international language
would not also change as a whole or in regional dialects. This is highly
improbable as the modern forces such as radio and other agencies of mass
communication would increasingly tend to unify and standardize and
preserve pronunciation. Dialects grow only where people are separated
with little communication between them.

To measure the degree to which any candidate language approaches
this ideal of universally uniform pronunciation an index of uniformity
of pronunciation, E6, may be developed. For this index, a survey of
perhaps a million words of oral speech would be needed. In this survey,
a sample of persons representative of the various regions, social classes,
etc. within each national language might be asked to read standardized
prose into a recording machine. From these recordings, the frequency
of each pronunciation of each word could be counted, and each pronunciation
of a word given a rank. Then the index E6⁷ would be that
million words divided by the sum of those words when each has been
multiplied by its frequency rank of pronunciation. This index, like
the previous ones, becomes 100%, showing complete uniformity of pronunciation,
when the rank of every word is one so that the index is one
million divided by one million. In proportion as there is more than
one pronunciation for each word, the denominator increases and the
index of uniform pronunciation shrinks. For example, if there were two
pronunciations only on the average for every word the ranks of 1 and 2
would occur equally often as weights in the denominator which would
then have the average value of 1.5, giving an index of one million
divided by a million and a half which is 67% of uniform pronunciation.


⁷ $E_6 = N/\sum R_6$, where $N$ = number of words; $R_6$ = frequency rank of pronunciation of each word.





Again, if there were, in general, three pronunciations of every word so 
that the ranks of 1, 2, and 3 occurred about equally often, then the
denominator would be twice the size of the numerator and the degree 
of the uniformity of pronunciation would be only 50%. 
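
The arithmetic of these two examples can be checked with a few lines of Python. The helper name `index_for_equal_variants` is hypothetical; it assumes, as the examples do, that every word has the same number of pronunciations and that they occur equally often, so that the denominator is simply the sample size times the mean rank.

```python
def index_for_equal_variants(k):
    """Index (in per cent) when each word has k equally frequent variants.

    The ranks 1..k then occur equally often, so the average weight in
    the denominator is the mean of 1..k and the index is 100 / mean.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    mean_rank = sum(range(1, k + 1)) / k
    return 100.0 / mean_rank


for k in (1, 2, 3):
    print(k, round(index_for_equal_variants(k)))
```

For two variants the mean rank is 1.5 and the index rounds to 67%; for three variants the mean rank is 2 and the index is 50%, in agreement with the figures in the text.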

A world language should have every letter unique in shape and sound. 
These two final criteria of excellence of any language apply only to 
its written and printed forms. The second means that every letter 
should have only one pronunciation and every elemental sound in the 
language should have a letter to represent it. The index to measure 
this is included in the index of phonetic spelling, E4 above.

For each letter to be unique in shape it means that every letter would 
have one and only one visual form, regardless of whether it occurs in 
print or in hand writing, or whether at the beginning of the word 
(where capitals are used in some languages), or in the middle or end 
of a word. Thus English has four forms for many of its letters and 
Arabic has three forms for many of its letters. To measure the degree 
of uniqueness of shape of letters more exactly a seventh index of excellence,
E7, may be defined as the ratio from a sample of a hundred
thousand letters in the representative samples of written and printed
prose. This numerator would be divided by the sum of those letters
each multiplied by its frequency rank of shape. This frequency rank of
shape, like the preceding frequency ranks, would be determined from a
count of the frequency of each shape of each letter. Putting them into
rank order and multiplying each letter by its rank and adding these
products gets the denominator of the index E7.⁸ It will be 100% only
when every letter (including its connection to an adjacent letter) has 
only one shape. 

Seven indices of excellence for any language have been defined above. 
The next scientific step is to combine them into a single index of excel¬ 
lence for any one candidate language. There are various possible ways 
of combining them. The simplest way is to draw a profile graph. This 
means to draw a column showing the percentage value of an index and 
placing the seven columns for the seven indices of one candidate lan¬ 
guage side by side. The broken line across the tops of these seven col¬ 
umns is the “profile” for that language. By drawing and superposing 
profiles for the different candidate languages it might be obvious that 
one or two are far superior in most respects to all the others (or possibly
far inferior to the others and so may be dropped from further consider¬ 
ation). 

⁸ $E_7 = N/\sum R_7$, where $N$ = number of letters in sample studied and $R_7$ = frequency rank of shape of each letter.






If the profile, however, shows several candidates with overlapping
profiles a more exact method of combining the seven indices into one
must be used. The simplest way is to get the simple unweighted average,
Ea, by adding them together and dividing their sum by 7. This
gives a simple average index of excellence for one language permitting
its excellence to be compared with the excellence of other candidate
languages. If more refined weighting is desired, it can be secured by
having panels of judges who are experts in the science of language distribute
100 points to the 7 criteria so as to show the relative importance
of each. The average number of points assigned by the judges to
each criterion would then be a weighting factor for that criterion. This
weighting factor for each criterion would be multiplied by the value of
its index before adding the 7 indices and dividing by the sum of the
weights to get the weighted average index of net excellence for one language, Ew.⁹
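
The two ways of combining the seven indices can be sketched in Python. The index values and judges' weights below are invented purely for illustration, the function names are hypothetical, and dividing by the sum of the weights is this sketch's reading of “dividing to get the weighted average.”

```python
def simple_average(indices):
    """Ea: the unweighted mean of the indices of excellence."""
    if not indices:
        raise ValueError("at least one index is required")
    return sum(indices) / len(indices)


def weighted_average(indices, weights):
    """Ew: sum(W * E) / sum(W), one weight per index."""
    if len(indices) != len(weights):
        raise ValueError("indices and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(e * w for e, w in zip(indices, weights)) / total_weight


# Hypothetical values of E1..E7 (per cent) for one candidate language.
indices = [100, 90, 80, 70, 60, 50, 40]
# Hypothetical average points from a panel of judges, totalling 100.
weights = [40, 10, 10, 10, 10, 10, 10]

print(simple_average(indices))
print(weighted_average(indices, weights))
```

Because the judges here place most weight on the first criterion, on which this made-up language scores best, the weighted average comes out higher than the simple one.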

Still more refined weighting schemes could be developed such as one
based upon the number of man-hours, or the amount of human energy,
required to learn and use whatever each index measures. Thus if unphonetic
spelling adds 20% of letters to the words of the language in
general then the writing, typing, and type setting of that language requires
20% more time than a language having phonetic spelling. Similarly,
the number of hours required on the average to learn the irregular
flections of a language compared with the number of hours required
to learn an otherwise equivalent but flectionless language would yield
a weighting factor for the third criterion dealing with flections.

THE PURPOSE OF THE CRITERIA

As a result of the researches outlined above there would be an index
of familiarity and an index of excellence (such as F3 and Ew) for each
of the languages, whether an artificial or a living one, which are candidates
to become the auxiliary world language. These indices would
serve three purposes. First, they would rank the candidate languages
and tell which was the most familiar and the most excellent. Thus the
problem of choosing the “best” world language would find a scientific
answer (based on rules in which the subjective element has been
minimized).

Secondly, these indices would diagnose and measure weakness and
the degree of strength in each candidate language whether in its freedom
from idioms, its regularity in word order, its freedom from flections,
its phonetic spelling, its uniqueness of meaning of words, its uniformity


⁹ $E_w = \sum WE / \sum W$, where $E$ = each of the preceding indices of excellence $E_1$ to $E_7$ in turn; $W$ = weight of each index.





of pronunciation or its uniqueness of the sound and the shape of its 
letters. From this diagnosis any living language could be simplified 
towards the ideal regularity of an artificial language while preserving 
more of familiarity to some part of the world’s population than artificial
languages possess.¹⁰

A third purpose of these indices is to measure any progress that the 
world may be making from decade to decade towards achieving a single 
world language. The relative degree of gain among the rival languages 
may be measured from period to period partly by technics of repre¬ 
sentative sampling as in Gallup polls. The degree of a person’s knowl¬ 
edge of a language, averaged for a population, must be also included 
in any accurate measurement. Any trend, however slight, towards a
single language eventually sweeping the field and becoming the sole 
auxiliary language would be shown, and its spread could be facilitated. 

¹⁰ This has been done for English. The resulting “Model English,” constructed by the author, has
the regularity of the most ideal artificial language coupled with greater familiarity to more people than
any rival national or synthetic language. Its indices of excellence are all 100% and its indices of familiarity
are well above all rivals. This gives Model English first rank by all criteria for a world language.




CONFIDENCE LIMITS IN THE NON-PARAMETRIC CASE


Gottfried E. Noether
Columbia University 

The purpose of this article is to give a survey¹ of certain
methods available for finding confidence limits when nothing 
is assumed about the population from which a sample has been 
drawn except possibly that it has a continuous distribution. 
The following three cases are treated: a confidence band for 
the unknown cumulative distribution function, a confidence 
interval for the proportion of a population for which the vari¬ 
ate is smaller (or larger) than a given value, and confidence 
intervals for quantiles. The results have been known for many 
years, but have often been accessible only to those who were 
able to follow rather involved mathematical arguments. It is 
the purpose of this paper to state certain important results 
without referring to any mathematical proofs. 

Introduction. In the application of statistical theory to practical
problems the normal distribution occupies a predominant position.
If we can assume that the observations which we have taken have come
from a normal population a great many of our troubles are over. However,
it will often happen that we do not know much more about the
parent population than is supplied by the sample itself. What can we do
then? The arbitrary assumption of normality may obviously lead to
completely wrong conclusions. Having only a scant knowledge about
the parent population we are forced to make very broad assumptions.
Thus in many cases it may be reasonable to assume that the unknown
cumulative distribution function (cdf) is continuous. This will be our
assumption in all that follows unless stated otherwise. In contrast
to the parametric case, when the form of the cdf is supposed to be known
except for the values of a finite number of parameters, this is often
referred to as the non-parametric case, since a finite number of parameters
is not sufficient to determine the distribution completely.

One of the important problems in the parametric case is the estimation
of the unknown parameter (for simplicity we assume that
there is just one) by a confidence interval. The question arises whether the
idea of a confidence interval can be extended to the non-parametric
case. Such an extension was made by Wald and Wolfowitz [1].


¹ I should like to thank Professor Wolfowitz for pointing out the need for such a survey.



Confidence Bands for Unknown Cumulative Distribution Functions.
Before going into the non-parametric case it may be well to review
briefly the basic idea underlying confidence intervals in the parametric
case. Let the random variable $X$ have the cdf $F(x, \theta)$, i.e.,
$P\{X \le x \mid \theta\} = F(x, \theta)$, where as usual $P\{A \mid \theta\}$ denotes the probability
of $A$ computed under the assumption that $\theta$ is true. The functional
form of $F(x, \theta)$ is supposed to be known. The only unknown quantity
is the true value of the parameter $\theta$.

The next best thing to knowing the exact value $\theta_0$ would be to have
an upper and lower bound for $\theta_0$. Thus we are led to try to determine
two numbers $U(x_1, \ldots, x_n)$ and $L(x_1, \ldots, x_n)$ depending on $n$ observations
$x_1, \ldots, x_n$ on the random variable $X$ such that
$L(x_1, \ldots, x_n) \le \theta_0 \le U(x_1, \ldots, x_n)$. However, this is only possible if we are willing
to take a definite risk of obtaining incorrect limits. Thus we can fix
a confidence coefficient $\alpha$, $0 < \alpha < 1$, and then determine two expressions
$L$ and $U$ (for simplicity we omit from now on to indicate that $L$ and
$U$ depend on the observed sample values) in such a manner that
$L \le \theta_0 \le U$ with probability $\alpha$. In other words, if we perform a great
many experiments, compute each time $L$ and $U$ corresponding to the
same confidence coefficient $\alpha$, and state every time that the true
parameter value lies between $L$ and $U$, we shall in the long run
make correct statements $100\alpha\%$ of the time.

In the case when the form of the distribution function is known 
except for several unknown parameters, confidence regions can be 
defined in a similar manner. However, as we have seen, if we assume 
only that the unknown cdf is continuous, a finite number of parameters 
is no longer sufficient to specify the distribution completely. Now in 
order to know F(x), we have to know its value for every x, $-\infty < x < +\infty$. 
Thus instead of looking for two numbers L and U as in the 
parametric case we should now look for two functions L(x) and U(x) 
defined for all x and then state that 

(1) $L(x) \le F(x) \le U(x), \quad -\infty < x < +\infty.$ 

As in the parametric case we ought to indicate that both L(x) and 
U(x) depend on the observations $x_1, \dots, x_n$, but for simplicity shall 
again omit to do so. As before, it is impossible to determine L(x) 
and U(x) in such a way that (1) is always true, but again we can fix 
a confidence coefficient $\alpha$, $0 < \alpha < 1$, such that in the long run (1) is 
true $100\alpha\%$ of the time. We shall say that L(x) and U(x) determine a 
confidence band for the unknown cdf F(x) corresponding to the confidence 
coefficient $\alpha$, meaning that the band determined by the graphs 
of L(x) and U(x) covers the graph of F(x) completely with probability $\alpha$. 

As in the parametric case there are infinitely many ways of determining 
L(x) and U(x). Very little work has been done so far in finding 
confidence limits in the non-parametric case which could be termed 
best. However, from the standpoint of facility of application one class 
of functions L(x) and U(x) seems to have definite advantages. Before 
we can describe this class of functions we have to define first what we 
mean by the sample cdf. 

Let again $x_1, x_2, \dots, x_n$ be a sample of size n from a population 
having cdf F(x). For simplicity we shall assume that $x_1 \le x_2 \le \dots \le x_n$. 
Since we are not interested in the order in which the sample was 
drawn this is no restriction. The sample or empirical cdf $F_n(x)$, as it is 
sometimes called, is now easily constructed. It is the step function 
which is equal to 0 for $x < x_1$ and equal to 1 for $x \ge x_n$, while increasing 
by 1/n at each of the values $x_i$, i = 1, 2, ..., n. Thus we can write 
$F_n(x) = \frac{1}{n}$ (the number of observations which are $\le x$). It can be 
shown that as $n \to \infty$ $F_n(x)$ converges stochastically to F(x), or, in other 
words, that as n increases we can be almost certain that $F_n(x)$ approaches 
F(x) more and more. 
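The definition of the sample cdf translates directly into a few lines of code. The following is only a minimal sketch; the function name `sample_cdf` and the four data values are hypothetical and serve purely as illustration.

```python
from bisect import bisect_right

def sample_cdf(sample):
    """Return F_n as a function: F_n(x) = (number of observations <= x) / n."""
    xs = sorted(sample)          # ordering the sample is no restriction
    n = len(xs)

    def F_n(x):
        # bisect_right counts how many ordered values are <= x
        return bisect_right(xs, x) / n

    return F_n

F_n = sample_cdf([0.3, 1.2, 0.7, 2.5])   # hypothetical observations
print(F_n(0.0), F_n(0.7), F_n(1.0), F_n(2.5))
```

The step function rises by 1/n at each observation, so with four observations the values printed are multiples of 1/4.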


Determination of L(x) and U(x). The convergence of $F_n(x)$ to F(x) 
which we have just stated suggests the following method of defining 
the lower and upper boundary of a confidence band for F(x): 

(2) $L(x) = \begin{cases} F_n(x) - d & \text{if } F_n(x) - d > 0 \\ 0 & \text{otherwise,} \end{cases} \qquad U(x) = \begin{cases} F_n(x) + d & \text{if } F_n(x) + d < 1 \\ 1 & \text{otherwise,} \end{cases}$ 

where d > 0 is a constant determined in such a way that (1) is satisfied 
with probability $\alpha$. Obviously, d is a function of $\alpha$ and n. 
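Once d is known, the band (2) is simply the sample cdf shifted down and up by d and cut off at 0 and 1. A minimal sketch follows; the name `type1_band`, the data, and the value d = 0.25 are hypothetical, and d is not matched here to any particular confidence coefficient.

```python
from bisect import bisect_right

def type1_band(sample, d):
    """Return a function x -> (L(x), U(x)) following definition (2)."""
    xs = sorted(sample)
    n = len(xs)

    def band(x):
        f = bisect_right(xs, x) / n            # sample cdf F_n(x)
        lower = f - d if f - d > 0 else 0.0    # L(x), cut off at 0
        upper = f + d if f + d < 1 else 1.0    # U(x), cut off at 1
        return lower, upper

    return band

band = type1_band([0.3, 1.2, 0.7, 2.5], d=0.25)   # hypothetical data and d
print(band(0.0), band(1.0), band(2.5))
```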

No formula is available to determine the value of d for any given 
$\alpha$ and n. However, Wald and Wolfowitz [1] have shown how to compute 
$\alpha$ when d and n are given. Though it would appear from (1) that 
$\alpha$ should also depend on F(x), this is, fortunately, not the case. Thus a 
double entry table for $\alpha$ corresponding to values of d and n could be 
computed. From such a table the value of d which for a fixed n corresponds 
to a given value of $\alpha$ could be found by interpolation. 

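The following scheme shows the computation that has to be performed 
to find $\alpha$. Compute 2n numbers, $a_i$ and $b_i$, i = 1, 2, ..., n, 
where 

$a_i = \begin{cases} \dfrac{i - nd}{n} & \text{if } \dfrac{i - nd}{n} > 0 \\ 0 & \text{otherwise,} \end{cases} \qquad b_i = \begin{cases} \dfrac{i - 1 + nd}{n} & \text{if } \dfrac{i - 1 + nd}{n} < 1 \\ 1 & \text{otherwise.} \end{cases}$ 

Define n functions $P_{k+1}(x)$, k = 0, 1, ..., n-1, with the help of 
the recursion formula 

$P_0(x) \equiv 1, \qquad P_{k+1}(x) = \int_{a_{k+1}}^{x'} P_k(t)\,dt,$ 

where 

$x' = \begin{cases} x & \text{if } x < b_{k+1} \\ b_{k+1} & \text{otherwise.} \end{cases}$ 

Then 

$\alpha = n!\,P_n(1).$ 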


It is easily seen how the fact that we have to consider various upper 
limits at each application of the recursion formula makes the computation 
of $\alpha$ very cumbersome. However, there is a very good approximation 
to $\alpha$ involving considerably less computational work. Set $\alpha = 2\bar{\alpha} - 1$, 
where $\bar{\alpha}$ is determined in the following way. Take the same $a_i$, i = 1, 2, 
..., n, as before. Define n functions $\bar{P}_{k+1}(x)$, k = 0, 1, ..., n-1, 
with the help of the recursion formula 

$\bar{P}_0(x) \equiv 1, \qquad \bar{P}_{k+1}(x) = \int_{a_{k+1}}^{x} \bar{P}_k(t)\,dt.$ 

Then 

(3) $\bar{\alpha} = n!\,\bar{P}_n(1).$ 

Now only one integration is necessary at each application of the recursion 
formula. This procedure can be reduced to the evaluation of 
a determinant of order n+1 [1]. The above approximation of $\alpha$ with the 
help of $\bar{\alpha}$ is such that it increases our protection in the sense that the 
true probability of our confidence band covering F(x) completely may 
be larger, but is never smaller than $2\bar{\alpha} - 1$. 
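Since the confidence coefficient does not depend on F(x), it can also be checked by simulation with F taken as the uniform distribution on (0, 1). The sketch below is a Monte Carlo substitute and not the recursion of Wald and Wolfowitz. It uses the fact, which follows from (1) and (2), that the band covers F completely exactly when the i-th ordered value of F(x_i) lies between i/n - d and (i - 1)/n + d for every i. The function names, the number of trials, and the seed are arbitrary choices made for illustration.

```python
import random

def covers(u_sorted, d):
    """True if the band F_n(x) -/+ d covers the uniform cdf F(u) = u everywhere."""
    n = len(u_sorted)
    for i, u in enumerate(u_sorted, start=1):
        a_i = max(0.0, i / n - d)          # smallest admissible i-th ordered value
        b_i = min(1.0, (i - 1) / n + d)    # largest admissible i-th ordered value
        if not (a_i <= u <= b_i):
            return False
    return True

def estimate_alpha(n, d, trials=20000, seed=0):
    """Monte Carlo estimate of the probability that the band covers F."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        u = sorted(rng.random() for _ in range(n))
        hits += covers(u, d)
    return hits / trials

print(estimate_alpha(n=1, d=0.6))   # the exact value for n = 1 is 2d - 1 = 0.2
```

For n = 1 the band covers F only when the single value F(x_1) lies between 1 - d and d, which gives the exact value 2d - 1 against which the simulation can be compared.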





Discontinuous cdf's. As mentioned in the beginning we have assumed 
throughout that the unknown cdf is continuous. Under this assumption 
the probability that two sample values are equal is zero. In practice, 
however, due to limitations of measurement, it is quite possible that 
two or more sample values turn out to be equal. In such a case it may 
happen that for some sample value $x = x_i$, say, the upper limit U(x) 
for $x < x_i$ is lower than the lower limit L(x) for $x \ge x_i$, making it impossible 
to draw a continuous cdf between L(x) and U(x). In such a case 
we shall be justified in asserting that the true distribution has a discontinuity 
at $x = x_i$. 

The question arises what is the probability that our confidence band 
constructed as described above covers completely the true cdf if it 
should be discontinuous. We can no longer state that this probability 
is exactly $\alpha$, but it has been shown that in this case we have 

$P\{L(x) \le F(x) \le U(x)\} \ge \alpha,$ 

so that actually our protection is better than claimed. 


Asymptotic Results. It is evident that it would be very desirable to have 
tables of the kind described earlier. In the meantime certain asymptotic 
results are available. If we let 

(4) $\lambda = d\sqrt{n},$ 

Smirnoff [2], generalizing a result by Kolmogoroff, has shown that 

(5) $\lim_{n \to \infty} \alpha = 1 - 2\sum_{m=1}^{\infty} (-1)^{m-1} e^{-2m^2\lambda^2}.$ 

Since (5) contains a very fast converging series, it is not difficult to 
compute $\lambda$ corresponding to a given confidence coefficient $\alpha$ to any 
desired degree of accuracy. The corresponding value of d is easily 
determined from (4) to be 

(6) $d = \dfrac{\lambda}{\sqrt{n}}.$ 

A short table of $\lambda$-values is given in Kolmogoroff [3]. These values are 
based on a table by Smirnoff [4]. 
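In the absence of a table, the series in (5) can be summed directly and inverted numerically. The following is a rough sketch: the bisection bounds, the number of terms, and the function names are arbitrary choices made for illustration, not part of the published procedure.

```python
import math

def limiting_alpha(lam, terms=100):
    """Right-hand side of (5): 1 - 2 * sum_{m>=1} (-1)^(m-1) exp(-2 m^2 lam^2)."""
    s = 0.0
    for m in range(1, terms + 1):
        s += (-1) ** (m - 1) * math.exp(-2.0 * m * m * lam * lam)
    return 1.0 - 2.0 * s

def solve_lambda(alpha, lo=0.2, hi=5.0, iters=60):
    """Bisection for the lambda whose limiting confidence coefficient is alpha."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if limiting_alpha(mid) < alpha:
            lo = mid       # lambda too small: the band would be too narrow
        else:
            hi = mid
    return 0.5 * (lo + hi)

lam_95 = solve_lambda(0.95)
print(round(lam_95, 3), round(lam_95 / math.sqrt(216), 4))   # lambda, and d from (6) for n = 216
```

For a confidence coefficient of .95 the solution is close to 1.36, in line with the rounded value 1.35 used in the examples of the text.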

Formula (4) may also be used in a different way. For a given investigation 
we may want to fix not only the confidence coefficient $\alpha$ but 
also the width of the confidence band we are going to obtain beforehand. 
Then with the help of (4) we can determine the size n of a 
sample which will satisfy these requirements, 

(7) $n = \dfrac{\lambda^2}{d^2}.$ 

Thus if we should decide that we want a 95% confidence band such that 
the upper and lower bounds for F(x) should not be more than .2 apart, 
we have to take $\lambda = 1.35$ and d = .1. Substituting in (7) we find that we 
shall have to use a sample containing at least 183 observations. 
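The arithmetic of this example can be retraced as follows. The function name is hypothetical; the rounding up to the next integer reflects the phrase "at least".

```python
import math

def required_sample_size(lam, d):
    """Smallest integer n satisfying (7), n >= lam^2 / d^2."""
    return math.ceil(lam ** 2 / d ** 2)

n = required_sample_size(lam=1.35, d=0.1)   # band of total width 2d = .2
print(n)
```

Here 1.35^2 / .1^2 = 182.25, which rounds up to 183.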

It is interesting to note that the width of the confidence band is 
inversely proportional to the square root of the sample size. This is 
equivalent to saying that the accuracy of our statements is directly 
proportional to the square root of the sample size. 

One-Sided Limits. As in the parametric case when it may happen 
that we are only interested in an upper (or lower) limit for the unknown 
parameter, it may happen in the non-parametric case that we only 
need an upper (or lower) limit for the unknown cdf. Exactly the same 
method applies as in the two-sided case, except that now we have to substitute 
$\bar{\alpha}$ as computed by (3) for $\alpha$, i.e., we can state with confidence 
coefficient $\bar{\alpha}$ that $F(x) \le U(x)$ (or $L(x) \le F(x)$), $-\infty < x < +\infty$. 

An asymptotic approximation is also available. Smirnoff [2] has shown 
that 

(8) $\lim_{n \to \infty} \bar{\alpha} = 1 - e^{-2\lambda^2},$ 

where again $\lambda = d\sqrt{n}$ as in (4). (8) is very easily solved for $\lambda$. Indeed 
we get $\lambda = +\sqrt{-\tfrac{1}{2}\lg(1 - \bar{\alpha})}$, where the logarithm is the natural or Napierian 
logarithm. 
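Solving the one-sided limiting formula for the constant needs nothing more than a logarithm and a square root. A minimal sketch, with hypothetical function names and the sample size n = 216 borrowed from the later example:

```python
import math

def one_sided_lambda(alpha_bar):
    """lambda = +sqrt(-(1/2) ln(1 - alpha_bar)), using the natural logarithm."""
    return math.sqrt(-0.5 * math.log(1.0 - alpha_bar))

def one_sided_d(alpha_bar, n):
    """Distance d = lambda / sqrt(n) between the sample cdf and the one-sided limit."""
    return one_sided_lambda(alpha_bar) / math.sqrt(n)

print(round(one_sided_lambda(0.95), 4), round(one_sided_d(0.95, 216), 4))
```

For a one-sided coefficient of .95 this gives a constant of about 1.22, smaller than the two-sided value, as one would expect.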

Goodness of Fit. It may be worth while pointing out that L(x) and U(x) 
can also be used to test the hypothesis that the unknown cdf $F(x) = F_0(x)$, 
where $F_0(x)$ is some given cdf. If we reject this hypothesis 
whenever $F_0(x)$ intersects either L(x) or U(x) or both, we shall be using 
a test with a critical region of size $1 - \alpha$. 
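For a continuous hypothetical cdf, the curve leaves the band somewhere exactly when the largest vertical distance between it and the sample cdf exceeds d, and that distance is attained at one of the observations. The sketch below uses this fact; the function names, the data, and the uniform hypothesis are illustrative assumptions only.

```python
def max_deviation(sample, F0):
    """Largest vertical distance between the sample cdf and a continuous cdf F0."""
    xs = sorted(sample)
    n = len(xs)
    worst = 0.0
    for i, x in enumerate(xs, start=1):
        p = F0(x)
        # just after x_i the sample cdf equals i/n, just before it equals (i-1)/n
        worst = max(worst, i / n - p, p - (i - 1) / n)
    return worst

def reject(sample, F0, d):
    """Reject F = F0 when F0 leaves the band of half-width d."""
    return max_deviation(sample, F0) > d

uniform_cdf = lambda x: min(1.0, max(0.0, x))     # hypothetical F0
data = [0.1, 0.2, 0.3, 0.4, 0.5]                  # hypothetical sample
print(max_deviation(data, uniform_cdf), reject(data, uniform_cdf, d=0.4))
```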

Confidence Bands Giving Confidence Intervals for F(x) at a Specified 
Value x. The method which we have just discussed assures us a confidence 
band that with probability $\alpha$ covers the true cdf F(x) in its 
entirety. It is not difficult to see, however, that for any given $x = x_0$, 
say, the probability that the corresponding interval $[L(x_0), U(x_0)]$ 
will contain the true value $F(x_0)$ is often considerably greater than $\alpha$. 
It follows that if we are only interested in finding a band which for a 
given, though arbitrary, value $x_0$ contains the true value $F(x_0)$ with 
probability $\alpha$, we can use a narrower band, thus increasing the accuracy 
of our statements. Such a band is easily found. In fact, let 
$F(x_0) = p$. Then p can be considered as the unknown parameter of a 
binomial variate X defined by $P\{X \le x_0\} = p$ and $P\{X > x_0\} = 1 - p = q$. 
We have reduced the problem to a parametric one the solution of 
which is well known. 

Using $F_n(x_0)$, where $F_n(x)$ is the sample cdf as before, as the sample 
estimate of p we can find two quantities $L'(x_0)$ and $U'(x_0)$ in such 
a way that $P\{L'(x_0) \le F(x_0) \le U'(x_0)\} = \alpha$. Now $x_0$ has been completely 
arbitrary. Letting it take all values from $-\infty$ to $+\infty$ we get 
two functions L'(x) and U'(x) determining a confidence band which 
satisfies our requirements. To distinguish between the two confidence 
bands we have obtained we shall refer to the one given by L(x) and 
U(x) as the type 1 and the one given by L'(x) and U'(x) as the type 
2 band. 

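If the sample size n is sufficiently large so that we can use the normal 
approximation to the binomial distribution, L'(x) and U'(x) are given 
by 

(9) $L'(x) = \dfrac{n}{n + t^2}\left[F_n(x) + \dfrac{t^2}{2n} - t\sqrt{\dfrac{F_n(x)[1 - F_n(x)]}{n} + \dfrac{t^2}{4n^2}}\,\right], \qquad U'(x) = \dfrac{n}{n + t^2}\left[F_n(x) + \dfrac{t^2}{2n} + t\sqrt{\dfrac{F_n(x)[1 - F_n(x)]}{n} + \dfrac{t^2}{4n^2}}\,\right],$ 

respectively, as is shown, e.g., in Cramér [5], p. 514, Ex. 2, where t is 
the $100(1 - \alpha)\%$ value of a normal deviate. 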
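The large-sample limits can be evaluated at any point once the value of the sample cdf there is known. The sketch below assumes the limits have the form n/(n + t^2) [F_n + t^2/(2n) -/+ t sqrt(F_n(1 - F_n)/n + t^2/(4n^2))], which is the usual normal-approximation interval for a binomial proportion; the function name and the inputs are hypothetical.

```python
import math

def type2_limits(f, n, t):
    """Normal-approximation limits for F(x) given f = F_n(x), sample size n, deviate t."""
    centre = f + t * t / (2.0 * n)
    half = t * math.sqrt(f * (1.0 - f) / n + t * t / (4.0 * n * n))
    scale = n / (n + t * t)
    return scale * (centre - half), scale * (centre + half)

lower, upper = type2_limits(f=0.5, n=216, t=1.96)
print(round(lower, 3), round(upper, 3))
```

At F_n(x) = 1/2 the two limits are symmetric about 1/2; elsewhere they are pulled slightly toward 1/2 and the interval becomes narrower.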

Let w(x) = U'(x) - L'(x) be the width of the type 2 confidence band. 
From (9) we find 

(10) $w(x) = \dfrac{2tn}{n + t^2}\sqrt{\dfrac{F_n(x)[1 - F_n(x)]}{n} + \dfrac{t^2}{4n^2}}.$ 

Thus the width is no longer a constant as it was for the confidence 
band of type 1, but is now a function of x having its maximum value 
for those x for which $F_n(x) = 1/2$ or is closest to 1/2. 

It is instructive to compare this maximum width with 2d, the width 
of the type 1 band, in an example. Let $\alpha = .95$ and n = 216, the size of 
the sample we shall consider later. We find $2d = 2\lambda/\sqrt{n} = 2.7/\sqrt{216} = .184$. 
For $F_n(x) = 1/2$ (10) reduces to $t/\sqrt{n + t^2}$. From a table of 
normal deviates we find t = 1.96. Substituting we get w = .132, which 
is quite an improvement over .184. 
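The two widths in this comparison are quickly verified. The variable names below are illustrative; the numbers are those of the example.

```python
import math

n = 216
lam = 1.35          # value used in the text for a confidence coefficient of .95
t = 1.96            # normal deviate used in the text

type1_width = 2.0 * lam / math.sqrt(n)          # 2d with d = lam / sqrt(n)
type2_max_width = t / math.sqrt(n + t * t)      # width of the type 2 band where F_n(x) = 1/2

print(round(type1_width, 3), round(type2_max_width, 3))
```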

So far we have assumed that we were able to make use of the normal 
approximation to the binomial distribution. If this approximation 
is not accurate enough, we have to compute binomial confidence intervals 
as shown by Clopper and Pearson [6]. Now for 
$x_k \le x < x_{k+1}$, k = 0, 1, ..., n, $x_0 = -\infty$, $x_{n+1} = +\infty$, we have 

(11) $L'(x) = \eta_k, \quad U'(x) = \theta_k,$ 

where $\eta_k$ is the lower, $\theta_k$ the upper binomial confidence limit corresponding 
to the observed frequency ratio $F_n(x_k)$. The original 
graphs of Clopper and Pearson together with others have recently 
been reproduced in Eisenhart [7], pp. 332-335. These graphs show $\eta$ 
and $\theta$ as functions of the observed sample ratio for n = 10, 15, 20, 30, 
50, 100, 250, 1000 corresponding to $\alpha$ = .95, .99 and for n = 5, 10, 15, 
25, 50, 100, 250, 1000 corresponding to $\alpha$ = .80, .90. For other values 
of n, $\eta$ and $\theta$ have to be found by interpolation. 

The exact values of $\eta_k$ and $\theta_k$ corresponding to any n are given by 

(12) $I_{\eta_k}(k, n - k + 1) = \dfrac{1 - \alpha}{2},$ 

(13) $I_{1 - \theta_k}(n - k, k + 1) = \dfrac{1 - \alpha}{2},$ 

where $I_x(p, q) = \int_0^x t^{p-1}(1 - t)^{q-1}\,dt \Big/ \int_0^1 t^{p-1}(1 - t)^{q-1}\,dt$ is the incomplete 
beta function. By definition $\eta_0 = 0$, $\theta_n = 1$. It is sufficient to solve either 
for the $\eta$'s or the $\theta$'s since by (12) and (13) 

$\eta_k + \theta_{n-k} = 1.$ 

To find, e.g., $\eta_k$ we can make use of the tables of percentage points 
of the incomplete beta function by Thompson [8], entering these tables 
with $\nu_1 = 2(n - k + 1)$ and $\nu_2 = 2k$ on the page giving the $100(1 - \alpha)/2$ 
percentage points. 
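Where tables are not at hand, the exact binomial limits can be found numerically. The sketch below replaces the incomplete beta function by the equivalent binomial tail sum (the incomplete beta function with arguments k and n - k + 1 equals the probability of k or more successes in n trials) and solves by bisection. It assumes that each tail is given probability (1 - alpha)/2; the function names are hypothetical.

```python
from math import comb

def upper_tail(k, n, p):
    """P{X >= k} for a binomial variate with parameters n and p."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

def bisect_root(g, target, iters=100):
    """Solve g(p) = target on [0, 1] for a g that increases with p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def binomial_limits(k, n, alpha):
    """Lower and upper binomial confidence limits for k successes out of n."""
    tail = (1.0 - alpha) / 2.0
    # lower limit: P{X >= k} = tail; equal to 0 by definition when k = 0
    eta = 0.0 if k == 0 else bisect_root(lambda p: upper_tail(k, n, p), tail)
    # upper limit: P{X <= k} = tail, i.e. P{X >= k + 1} = 1 - tail; equal to 1 when k = n
    theta = 1.0 if k == n else bisect_root(lambda p: upper_tail(k + 1, n, p), 1.0 - tail)
    return eta, theta

print(binomial_limits(3, 10, 0.95))
```

The symmetry between lower and upper limits means that only one of the two sets actually has to be computed.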

When introducing the confidence band of type 2 we stated that we 
wanted a confidence band such that the probability that for any arbitrary 
$x_0$, $L'(x_0) \le F(x_0) \le U'(x_0)$ was $\alpha$. This statement may have been 
somewhat misleading. We cannot make this probability exactly equal 
to $\alpha$. For large n we used the normal approximation, thus committing 
a slight error, while for small n due to the discontinuous character of 
the binomial distribution exact confidence intervals do not exist, and 
we have to be satisfied with the statement that the confidence coefficient 
is $\ge \alpha$. 





It may be well to illustrate the use of the two types of confidence 
bands by some examples. An economist may want to analyze the income 
structure in a given community. If he is interested in the 
distribution of income over a specific range he will need a type 1 band, 
since he is looking for the joint occurrence of certain events. If, on the 
other hand, he only wants to state that at least l% and at most u% 
of the population earn no more than $.... a year a type 2 band will give 
him the appropriate answer. 

It may happen that we are not interested in an upper but only in a 
lower limit. A social worker who wants to prove the need for a new 
hospital in a certain section will want to state that at least such and 
such a percentage of the residents earn less than $.... a year and thus 
cannot afford to go to a private hospital if the need should arise. 
The answer will be given by a type 2 band constructed with the help 
of one-sided confidence intervals. Then U'(x) = 1, while L'(x) = $\eta_k$, 
where again $\eta_k$ is the solution of (12), except that this time the right 
hand side should read $1 - \alpha$. 

Confidence Intervals for Quantiles. A confidence band of type 2 can also 
be used to construct confidence intervals for quantiles, i.e., confidence 
intervals for the value $q_p$ for which $F(q_p) = p$, 0 < p < 1. Such a confidence 
interval consists of all those values x for which $L'(x) < p < U'(x)$. 
These values of x are bounded by two observations, $x_i$ and $x_j$, say, $x_i$ 
being the smallest value for which $U'(x) > p$, $x_j$ the smallest for which 
$L'(x) \ge p$. If $\alpha$ is the confidence coefficient connected with our confidence 
band, we can then say that 

$x_i \le q_p \le x_j$ 

is a confidence interval with confidence coefficient $\alpha$ for the unknown 
quantile $q_p$, i.e., in the long run confidence intervals chosen in this 
way will include the true value $q_p$ $100\alpha\%$ of the time. 

If a type 2 confidence band has been constructed $x_i$ and $x_j$ can easily 
be read off. However, we can find the two values also algebraically. 
To be exact, using the formulas for L'(x) and U'(x) we can find two 
integers i and j such that the corresponding observations $x_i$ and $x_j$ 
are the two observations we are looking for. 

For large n the type 2 confidence band is given by (9). Let as usual 
[y] stand for the largest integer $\le y$ and set 

(14) $y_1 = np - t\sqrt{np(1 - p)}, \qquad y_2 = np + t\sqrt{np(1 - p)},$ 


where as in (9) t stands for the $100(1 - \alpha)\%$ value of a normal deviate. 
Then i and j are found to be given by 

(15) $i = [y_1] + 1, \quad j = [y_2] + 1,$ 

unless $y_2$ is an integer itself, in which case $j = y_2$. 

Since we are only using an approximate confidence band, the solution 
given by (15) may sometimes lead to a confidence interval the true 
confidence coefficient of which is slightly $< \alpha$. Therefore, a safer, though 
very often unnecessarily wide, confidence interval is given by $i = [y_1]$, 
$j = [y_2] + 1$. It should also be remembered that if p (or 1 - p) is small 
the normal distribution is not well suited to serve as an approximation 
to the binomial distribution, even if n is relatively large. 
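The ranks i and j of the two bounding observations follow mechanically from the quantities y_1 = np - t sqrt(np(1 - p)) and y_2 = np + t sqrt(np(1 - p)). A minimal sketch, with a hypothetical function name; it makes no attempt to keep the ranks within 1 and n, which would be needed for extreme p or very small n.

```python
import math

def quantile_ranks(n, p, t):
    """Ranks (i, j) of the order statistics bounding the quantile q_p (large n)."""
    spread = t * math.sqrt(n * p * (1.0 - p))
    y1 = n * p - spread
    y2 = n * p + spread
    i = math.floor(y1) + 1
    # j = [y2] + 1, unless y2 is itself an integer, in which case j = y2
    j = int(y2) if y2 == math.floor(y2) else math.floor(y2) + 1
    return i, j

print(quantile_ranks(n=216, p=0.5, t=1.96))   # median, sample of 216
```

The interval is then read off from the ordered sample as the i-th and j-th observations.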

If the type 2 confidence band is given by (11) i and j are found to 
be defined by the relations 

$\theta_{i-1} \le p, \quad \theta_i > p,$ 

$\eta_{j-1} < p, \quad \eta_j \ge p,$ 

where the $\eta$'s and $\theta$'s are given by (12) and (13). 

A case of special interest arises when p = 1/2. Then $q_p$ is the median 
M of the unknown distribution. For this case Nair [9] has tabulated 
the values of i and j for $n \le 81$ corresponding to $\alpha = .95$ and $n \le 76$ corresponding 
to $\alpha = .99$. 

We shall close with one specific application. In biological work it is 
often important to find a confidence interval for the median lethal dose 
of a given drug. This is the dose which would kill 50% of the animals 
of a given population. In the second experiment described in Bliss [10] 
the individual lethal doses of digitalis (cc. of tincture per kg. of live 
weight) of 216 test animals, in this case cats, were obtained. These 
data,2 after adjustment for laboratory differences, can, with some caution,3 
be considered as a random sample of individual lethal doses. 

The usual procedure for finding a confidence interval for the median 
dose is to take logarithms of the observations and assume that the log-doses 
are then normally distributed. If we are willing to make this assumption 
a confidence interval for the median log-dose can be found 
by a well known procedure which, however, involves considerable computation. 
The fact that the log-doses are approximately normally distributed 
has been observed in many experiments of this kind, but Mather 
[12], esp. p. 240, warns that for any new drug or new method of 
preparation the logarithmic transformation has to be justified anew. 
Then what shall we do in such a case if we have a sample of observations 
which is obviously not large enough to carry out a conclusive 
test of normality? An unjustified assumption of normality may lead 
to a serious mistake in our conclusions. However, we may use the more 
general procedure outlined in this section. The assumed continuity of 
the unknown distribution function is hardly a restriction. Besides, 
Scheffé and Tukey [13] have shown that the same procedure is applicable 
for all possible cdf's, if we only change the confidence statement 
slightly. 

2 I want to thank Professor Bliss for putting these data at my disposal. 
3 See in this connection [11]. 

Let us now return to the data mentioned above and find a confidence 
interval for the median dose without assuming that by transforming to 
logarithms we can make use of normal theory. Substituting p = 1/2 and 
n = 216 in (14) we find at the 95% level 

$y_1 = 108 - .98\sqrt{216} = 93.6,$ 
$y_2 = 108 + .98\sqrt{216} = 122.4,$ 

or by (15) 

i = 94, j = 123. 

All that remains to be done is to order the observations in increasing 
order of size and select the 94th and 123rd observations. Thus we find 
the confidence interval 

$.640 \le M \le .671.$ 

It may be worth while emphasizing at this point the extreme ease 
with which all the non-parametric procedures we have encountered 
can be carried out. 

A more important problem of biological assay than determining a 
confidence interval for the median dose is to find a confidence interval 
for the relative potency of a preparation of unknown potency compared 
with that of a standard preparation. As a matter of fact the 
data we used were originally of this kind. Again this problem is usually 
solved on the basis of normal theory; however, non-parametric methods 
are also available in this case. I hope to be able to come back to this 
problem at some later date. 

One more remark before concluding: it is often stated that non-parametric 
methods lead to much wider confidence intervals than 
parametric methods, in particular the use of normal theory. This is 
of course true, and parametric methods should be used whenever 
there is a sound basis for their application, unless the greater ease of 
non-parametric methods should outweigh the advantage that can be 
gained by the use of parametric methods. However, to assume that a 
sample has been drawn from a normal population for the sole reason of 
obtaining narrower confidence intervals is to defeat the purpose of 
statistics. 


REFERENCES 

[1] A. Wald and J. Wolfowitz, “Confidence limits for continuous distribution 
functions,” Annals of Math. Stat., vol. 10 (1939), pp. 105-118. 

[2] N. Smirnoff, “Sur les écarts de la courbe de distribution empirique,” Recueil 
Mathématique (Matematicheski Sbornik), New Series, vol. 6 (1939), pp. 3-26. 

[3] A. Kolmogoroff, “Confidence limits for an unknown distribution function,” 
Annals of Math. Stat., vol. 12 (1941), pp. 461-463. 

[4] N. Smirnoff, “Table for estimating the goodness of fit of empirical distributions,” 
Annals of Math. Stat., vol. 19 (1948), pp. 279-281. 

[5] H. Cramér, Mathematical Methods of Statistics, Princeton, 1946. 

[6] C. J. Clopper and E. S. Pearson, “The use of confidence or fiducial limits 
in the case of the binomial,” Biometrika, vol. 26 (1934), pp. 404-413. 

[7] C. Eisenhart, M. W. Hastay, and W. A. Wallis, Selected Techniques of Statistical 
Analysis, New York, 1947. 

[8] C. M. Thompson, “Tables of percentage points of the incomplete beta function,” 
Biometrika, vol. 32 (1941), pp. 151-181. 

[9] K. R. Nair, “Table of confidence intervals for the median in samples from 
any continuous population,” Sankhya, vol. 4 (1940), pp. 551-558. 

[10] C. I. Bliss, “The U.S.P. collaborative cat assays for digitalis,” J. Am. 
Pharm. Ass., vol. 33 (1944), pp. 225-245. 

[11] C. I. Bliss and M. G. Allmark, “The digitalis cat assay in relation to rate of 
injection,” J. of Pharmacology and Exp. Therapeutics, vol. 81 (1944), pp. 
378-389. 

[12] K. Mather, Statistical Analysis in Biology, New York, 1947. 

[13] H. Scheffé and J. W. Tukey, “Non-parametric estimation. I. Validation of 
order statistics,” Annals of Math. Stat., vol. 16 (1945), pp. 187-192. 



ON A METHOD OF ESTIMATING BIRTH AND DEATH 
RATES AND THE EXTENT OF REGISTRATION 

C. Chandra Sekar 

All-India Institute of Hygiene and Public Health, Calcutta 
(on loan to the Population Division of the United Nations) 
AND 

W. Edwards Deming 
Bureau of the Budget, Washington 

A mathematical theory is presented which, when applied to a comparison 
of the registrar's list of births and deaths with a list 
obtained in a house-to-house canvass, gives an estimate of the total 
number of events over an area in a specified period; also the extent 
of registration. 

In the development of the theory, allowance is made for the fact that 
the chance of an event being missed on one list (registrar's list or the 
house-to-house canvass) may not be independent of its chance of being 
missed on the other list. Where there is likely to be lack of independence, 
a test is suggested and a method introduced to reduce the effect 
of dependence. This is done by subdividing the data into small homogeneous 
groups, such as might be formed by small areas, sex and age 
classes, domiciliary and institutional births; then by estimating the 
number of events in these groups separately and summing them for a 
total. The standard errors of the estimates are given. 

The theory is applied to an enquiry that was conducted in February 
1947 over an area known as the Singur Health Centre, near Calcutta, 
covering the years 1945 and 1946 separately, and it is found that the 
estimated total number of events for the area is usually greater when the 
estimate is built up by summing the totals for individual groups than 
when it is computed at once for the aggregated population. According 
to the theory this observation confirms positive dependence and indicates 
that the greater figure is nearer the truth. 

The annual number of births and deaths in the Singur Health Centre 
(total population 64,000) is estimated subject to a standard error of 
from 1 to 3 per cent, and the registration is estimated to vary from about 
40 to 70 per cent with a standard error of about 3 per cent. This 
enquiry provides basic ground work for the design of future surveys, 
and it is estimated that at a cost of Rs. 10,000 to Rs. 15,000 (3 rupees 
to the U.S. dollar) estimates of birth and death rates for an entire District 
in India with a population of one to two millions can be obtained 
with an overall standard error of about 5 per cent. 



Purpose. The purpose here is to present a theory by which, when vital 
registration is incomplete, an enquiry in the form of a house-to-house 
canvass may be used in conjunction with the registrar's list to estimate: 
i. the total number of births and deaths in an area over a specified 
period; ii. the birth and death rates; iii. the deficiencies in registration; 
and iv. the standard errors of all these estimates. The theory will 
first be presented, then applied to particular surveys in the Singur 
Health Centre. 

Method of enquiry. The application of the theory which is to be developed 
requires a comparison of the entries on: 

1. The registrar's list (referred to as R) 

2. The result of a complete house-to-house canvass carried 
out by an interviewer (referred to as I) 

and the classification of the entries on these lists into the following 
four exhaustive groups: 

C, the number of entries recorded in I and also in R (such 
entries, being found on both lists, are assumed to be correct 
without investigation). 

$N_1$, entries recorded only in R but not in I, and after investigation 
found to be correct. 

$N_2$, entries recorded only in I but not in R, and after investigation 
found to be correct. 

X, entries recorded on one list or the other, but not both, and 
found after investigation to be incorrect. 

This is a complete classification of the entries on the lists but not of 
the events. There will also be a number Y of events which are missed 
by both lists; this number will be estimated later by application of the 
theory. 

Theory. Let N be the total number of events (births or deaths) in the 
specified period. Then an estimate of N is furnished by the formula 
$\hat{N} = C + N_1 + N_2 + N_1 N_2 / C$, wherein $N_1 N_2 / C$ is an estimate of the number 
of events Y missed by both R and I. This formula of estimation 
assumes that the chance of an event being missed on either list is independent 
of the chance of being missed on the other. A method is 
presented later on for investigating the validity of the assumption of 
independence and for introducing a modification where necessary. 
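The estimate is easily computed once the entries have been classified. In the sketch below the function name and the three counts are hypothetical, chosen only so that the arithmetic is transparent; they are not taken from the Singur enquiry.

```python
def estimate_total(C, N1, N2):
    """Estimated total number of events, C + N1 + N2 + N1*N2/C, under independence."""
    missed_by_both = N1 * N2 / C          # estimate of Y, events on neither list
    return C + N1 + N2 + missed_by_both, missed_by_both

# hypothetical counts: 600 on both lists, 200 on R only, 150 on I only
total, missed = estimate_total(C=600, N1=200, N2=150)
print(total, missed)
```

With these counts an estimated 50 events are missed by both lists, bringing the estimated total to 1000.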

It can be shown that: i. $\hat{N}$ is an unbiased estimate in the limit when 
N becomes large and the assumption just mentioned is valid; ii. the 
maximum likelihood estimate is equal to $\hat{N}$ in the limit; iii. the standard 
error of $\hat{N}$ is $\sqrt{N q_1 q_2 / p_1 p_2}$. The last formula will be developed in the 
appendix. Here, 

$p_1$ = the chance of R detecting an event 
$p_2$ = the chance of I detecting an event 
$p_1 + q_1 = p_2 + q_2 = 1.$ 

It follows that the better the performance in either R or I, the higher 
be $p_1$ or $p_2$, the smaller be $q_1$ or $q_2$, and the more precise be the estimate 
$\hat{N}$ of the total of events. It follows, moreover, that the precision of $\hat{N}$, 
expressed as a proportion (namely as a coefficient of variation), is 
$\sqrt{q_1 q_2 / N p_1 p_2}$, wherefore if the theory be applied over an area large 
enough to contain a large number N of events, the total number N of 
events will be estimated with great relative precision. 

The symbol $p_1$ is a measure of performance of the registrar, an estimate 
for which is $\hat{p}_1 = C/(C + N_2)$. This estimate $\hat{p}_1$ of $p_1$ is subject 
to a coefficient of variation of 

$\sqrt{\dfrac{q_1}{(C + N_2)\,p_1} \cdot \dfrac{N - C - N_2}{N - 1}}.$ 

This error decreases as $C + N_2$ increases. For perfect performance on the 
part of the interviewer, $C + N_2 = N$, and there is then no error in estimating 
the performance of the registrar. 
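The standard error and coefficient of variation of the estimated total can be evaluated by substituting estimates for the unknown probabilities. The sketch below takes the standard error of the estimated total to be sqrt(N q_1 q_2 / (p_1 p_2)) and the registrar's performance to be estimated by C/(C + N_2). Estimating the interviewer's performance by C/(C + N_1) is an assumption made here by symmetry, and the counts are hypothetical.

```python
import math

def registration_summary(C, N1, N2):
    """Plug-in estimates of the total, both performances, standard error and c.v."""
    N_hat = C + N1 + N2 + N1 * N2 / C
    p1_hat = C / (C + N2)     # registrar's performance
    p2_hat = C / (C + N1)     # interviewer's performance, by symmetry (assumption)
    q1_hat, q2_hat = 1.0 - p1_hat, 1.0 - p2_hat
    se = math.sqrt(N_hat * q1_hat * q2_hat / (p1_hat * p2_hat))
    return N_hat, p1_hat, p2_hat, se, se / N_hat

N_hat, p1_hat, p2_hat, se, cv = registration_summary(C=600, N1=200, N2=150)
print(N_hat, p1_hat, p2_hat, round(se, 2), round(cv, 4))
```

With these hypothetical counts the registrar is estimated to record 80 per cent of events, and the total is estimated with a coefficient of variation below 1 per cent.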

The foregoing development is oversimplified. In practice there are 
some problems to take account of: incomplete investigation of the R 
lists; incomplete coverage of the population in the house-to-house 
canvass. Special types of events, like those occurring in institutions, 
are best taken care of as a separate group. Then again there is the 
problem of investigating the assumption mentioned above, and of 
measuring and correcting for the correlation between the chance of an 
event being missed by R and being missed by I. These points will be 
examined in the following paragraphs. 

Effect of incomplete investigation of the registrar's lists. In the investigation 
of the R-lists there may be some entries left over unclassified by 
reason of incompleteness of entry, illegibility, or simple failure for any 
reason whatever on the part of the investigator to finish his job. So long 
as the correct entries amongst the unclassified entries on the R-list 
constitute unbiased samples from the two categories C and $N_1$ mentioned 
earlier, the omission of the unclassified entries from the calculations 
does not affect the estimation of N, the total number of events. 
The estimate of the extent of registration will be too low if the unclassified 
entries contain, as is likely, correct entries classifiable as C. 
If the unclassified entries are all counted as correct, contrary to fact, 
the calculations will lead to an overestimate of the extent of registration. 

Effect of incompleteness of coverage of the population. As in every 
population enquiry, there will be some failures to elicit information 
from all the households. This will happen when some households in 
which an event took place have moved away temporarily or permanently, 
or when no responsible person can be found at home to give 
the information. So long as the events in the uninterviewed portion of 
households are included in the R-list to the same extent as those in 
the interviewed households, the estimation of N is unaffected. The 
calculation of N may therefore be little affected by incompleteness of 
coverage of the population. 

The effect of institutional events. In rural areas the bulk of the births 
are domiciliary, but there are some small scattered hospitals drawing 
patients from a wide area, and a high proportion of the events that take 
place in them are for non-residents. The R-list may contain some or 
even all of the entries for these institutional events because the 
registrar is able to ascertain this information easily and accurately from 
the institutions. The interviewer, on the other hand, will, by the nature 
of a house-to-house canvass, fail to discover an institutional event 
concerning people who had no family connections in the area. Institutional 
events, as they are accurately ascertainable, are best 
handled as a separate block and not as a problem of estimation. 

The effect of correlation between events missed on both lists. The first 
step is to define this correlation. The registrar and his co-workers 
will detect some events and miss others. The probability that the interviewer 
[2] will detect an event that was missed by R may be different 
from the probability that he will detect an event that was recorded 
by R. If these two probabilities are equal there is complete independence, 
but otherwise there is not, in which case the formula given above for 
the estimation of the total number of events will be incorrect. The 
extent of the error can be investigated. If as before, 

$p_1$ = the probability of the registrar detecting an event 
$q_1$ = the probability of the registrar missing it 

then the probabilities in the 4 groups will be shown by the accompanying 
table, which defines four new probabilities, $p_{21}$, $p_{22}$, $q_{21}$, $q_{22}$. p and 
q are always complementary: $p_{21} + q_{21} = p_{22} + q_{22} = 1$. 


Group 

Probability 

C Detected by both 

piPn 

Ni Detected by registrar only 

Piqn 

N» Detected by interviewer only 

gi2>22 

Y Missed by both 

qm 


If there is complete independence between the events missed by both
R and I, then $p_{21}=p_{22}=p_2$, introduced previously, and $q_{21}=q_{22}=q_2$.
When there is dependence the expected value of the estimate of the
number of events Y missed by both R and I will be close to

$$\frac{N p_1 q_{21}\, q_1 p_{22}}{p_1 p_{21}}$$

whereas the correct value is $N q_1 q_{22}$. The difference is

$$\frac{N p_1 q_{21}\, q_1 p_{22}}{p_1 p_{21}} - N q_1 q_{22} = N q_1\,\frac{p_{22}-p_{21}}{p_{21}}.$$

So if $p_{21}>p_{22}$, the total number of events is underestimated and if
$p_{21}<p_{22}$, the converse. We surmise that $p_{21}>p_{22}$ is likely to be the case.

Similarly, in the case of dependence, the registrar's performance is
estimated as $p_1 p_{21}/(p_1 p_{21}+q_1 p_{22})$ instead of $p_1$, the difference being
$p_1 q_1 (p_{21}-p_{22})/(p_1 p_{21}+q_1 p_{22})$. If $p_{21}>p_{22}$ the registrar's performance is
overestimated and if $p_{21}<p_{22}$, the converse.

If

  $p_1 = .8$       $q_1 = .2$
  $p_{21} = .6$    $q_{21} = .4$
  $p_{22} = .4$    $q_{22} = .6$

the bias in the estimation of the total number of events will be

$$q_1\,\frac{p_{22}-p_{21}}{p_{21}} = -.067 \text{ or } -6.7 \text{ per cent.}$$
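The numerical example can be checked with a short sketch. It assumes the usual two-list estimate, total = C + N1 + N2 + N1·N2/C, and substitutes the expected proportions of the four groups for the observed counts; the function names are illustrative and not taken from the paper.

```python
def expected_groups(p1, p21, p22):
    """Expected proportions of events in the groups C, N1, N2 and Y.

    p1  : probability that the registrar detects an event
    p21 : probability that the interviewer detects an event the registrar recorded
    p22 : probability that the interviewer detects an event the registrar missed
    """
    q1 = 1.0 - p1
    c = p1 * p21            # detected by both
    n1 = p1 * (1.0 - p21)   # registrar only
    n2 = q1 * p22           # interviewer only
    y = q1 * (1.0 - p22)    # missed by both
    return c, n1, n2, y


def relative_bias(p1, p21, p22):
    """Relative bias of the two-list estimate when expected proportions are used."""
    c, n1, n2, _ = expected_groups(p1, p21, p22)
    estimate = c + n1 + n2 + n1 * n2 / c   # estimate per unit of true total
    return estimate - 1.0


if __name__ == "__main__":
    # Values of the example in the text: p1 = .8, p21 = .6, p22 = .4
    print(round(relative_bias(0.8, 0.6, 0.4), 3))
```

With equal interviewer probabilities (p21 = p22) the function returns zero, which corresponds to the case of complete independence.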


This bias may be much more important than the standard error of an
estimate of the total number of events made under the assumption of
zero correlation.

Method to reduce the effect of correlation. It is important to note that
correlation signifies heterogeneity in the population for it implies that
events that fail to be detected do not form a random sample from
the whole population of events. This heterogeneity may arise only if
there are differences in the reporting rates for different segments of the
population, resulting in the group of failures being weighted
disproportionately by the different segments.

It therefore follows that the correlation can be minimized by dividing
the population into homogeneous groups and calculating the total
number of events separately for each group; then by addition getting
the grand total. In order to put this suggestion into practice, let us
consider the difference between two estimates of the total number of
events: i. by dividing the population into homogeneous groups and
estimating the events in each group separately, then forming a grand
total; ii. by treating the entire population as a unit. Let the population
be comprised of k homogeneous groups, with $N_i$ events in the i-th
group (i = 1, 2, ..., k). Then let $p_1^{(i)}$ be the probability of the
registrar detecting an event in the i-th group, and $p_2^{(i)}$ the corresponding
probability for the interviewer. The expected value of the number of
events missed by both in the i-th group is $N_i q_1^{(i)} q_2^{(i)}$ and for the entire
population the total missed by both will be $\sum N_i q_1^{(i)} q_2^{(i)}$. As by
definition there are only k homogeneous groups, this value will be estimated
without bias when the groups are treated separately. But if the entire
population of events were pooled, the expected value for the estimate
of the number of events missed by both would be close to

$$\frac{\left[\sum N_i p_1^{(i)} q_2^{(i)}\right]\left[\sum N_i q_1^{(i)} p_2^{(i)}\right]}{\sum N_i p_1^{(i)} p_2^{(i)}}.$$

The difference in the two values (the total for separate groups less the
pooled value) will be

$$\frac{N r s_1 s_2}{\bar p_1 \bar p_2 + r s_1 s_2}$$

where

$$N=\sum N_i, \qquad \bar p_1 = \frac{\sum N_i p_1^{(i)}}{\sum N_i}, \qquad \bar p_2 = \frac{\sum N_i p_2^{(i)}}{\sum N_i},$$

$$s_1^2 = \frac{\sum N_i \left[p_1^{(i)} - \bar p_1\right]^2}{\sum N_i}, \qquad s_2^2 = \frac{\sum N_i \left[p_2^{(i)} - \bar p_2\right]^2}{\sum N_i},$$

and

$$r = \frac{\sum N_i \left[p_1^{(i)} - \bar p_1\right]\left[p_2^{(i)} - \bar p_2\right]}{s_1 s_2 \sum N_i}$$

is the correlation coefficient between $p_1^{(i)}$ and $p_2^{(i)}$, weighted by $N_i$,
the number of events to which they have reference. If r>0, then treating
the entire population as a unit, we are led to an underestimation
of the number of events missed by both parties and therefore an
underestimation of the total number of events. This also results in an
overestimate of the extent of registration. If this is the case, the
population need be divided only to the stage when further division shows no
increase in the total number of events. It should be possible by actual
trial with some real data to decide whether (e.g., in computing number
of deaths) 5-year age groups are a more effective subdivision than
10-year age groups; and whether infant deaths should be treated
separately.
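The effect of pooling heterogeneous groups can be illustrated with a minimal sketch. It assumes independence of registrar and interviewer within each group, works with expected counts rather than sampled data, and uses the two-list estimate C + N1 + N2 + N1·N2/C. The group sizes and probabilities in the usage lines are hypothetical.

```python
def two_list_estimate(c, n1, n2):
    """Estimated total number of events from the matched and unmatched counts."""
    return c + n1 + n2 + n1 * n2 / c


def stratified_and_pooled(groups):
    """groups: list of (N_i, p1_i, p2_i). Returns (sum of group estimates, pooled estimate)."""
    stratified = 0.0
    tot_c = tot_n1 = tot_n2 = 0.0
    for n, p1, p2 in groups:
        c = n * p1 * p2                # expected matches in the group
        n1 = n * p1 * (1.0 - p2)       # expected registrar-only entries
        n2 = n * (1.0 - p1) * p2       # expected interviewer-only entries
        stratified += two_list_estimate(c, n1, n2)
        tot_c, tot_n1, tot_n2 = tot_c + c, tot_n1 + n1, tot_n2 + n2
    return stratified, two_list_estimate(tot_c, tot_n1, tot_n2)


def predicted_shortfall(groups):
    """N r s1 s2 / (p1bar p2bar + r s1 s2), using r s1 s2 = weighted covariance."""
    n = sum(g[0] for g in groups)
    p1bar = sum(g[0] * g[1] for g in groups) / n
    p2bar = sum(g[0] * g[2] for g in groups) / n
    cov = sum(g[0] * (g[1] - p1bar) * (g[2] - p2bar) for g in groups) / n
    return n * cov / (p1bar * p2bar + cov)


if __name__ == "__main__":
    # Hypothetical groups: both agencies do well in one group and poorly in the other.
    groups = [(1000, 0.9, 0.8), (1000, 0.5, 0.4)]
    stratified, pooled = stratified_and_pooled(groups)
    print(stratified, pooled, stratified - pooled, predicted_shortfall(groups))
```

In this sketch the detection probabilities are positively correlated across groups, so the pooled estimate falls short of the stratified total by the amount the formula predicts.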

The enquiry in Singur Health Centre. The Singur Health Centre consists
of four contiguous Union Boards,* viz., Singur, Balarambati,
Bora, and Begompur, situated in the Serampore subdivision of the
Hooghly district. The village Singur which serves as the headquarters
is only 21 miles away from Calcutta and is easily accessible by rail
from Calcutta. The total area of the Centre is about 33 square miles
and comprises 68 villages with a total population of about 64,000
distributed over 12,000 families living in about 8,300 houses. As is usual
in West Bengal, the villagers live close together in a compact block and
wide fields separate such blocks. Since 1944 this area has formed the
controlled practice field of the All India Institute of Hygiene and Public
Health, Calcutta, for their experiment in Public Health Methodology.

Procedure for registration. The procedure for the registration of births
and deaths in this area follows closely the method adopted in other
parts of Bengal. The Chowkidar, i.e., the village headman, is the
reporting agent and is required to submit periodically to the Sanitary
Inspector,* who is the registrar of the area, a list of births and deaths.


* The Bengal Province is divided into divisions, the divisions into districts, the districts into
subdivisions, the subdivisions into thanas, and the thanas into Union Boards.

* A Sanitary Inspector is usually in charge of the health activities of a thana.



TABLE I

THE INVESTIGATORS' REPORT ON THE COMPARISON OF THE R AND I LISTS OF SINGUR HEALTH CENTRE

                                                    Births*                    Deaths*
                                                1945         1946          1945         1946

Total number of events in the lists         [illegible]     2,669      [illegible]     1,062
Total                                       [illegible]  [illegible]      1,083          866
Number verified:
  Common C, found in the interviewer's
    lists                                       794      [illegible]        860          430
  Extra N1, not found in the
    interviewer's lists                         710          786            733          427
No. non-verifiable, illegible,
  incomplete, etc.                              166      [illegible]        190          117
Number incorrect                            [illegible]      189       [illegible]        60
[heading illegible]                             741         1,009           372          421

* Listed as occurring in the village (excluding non-resident institutional).



With a view to improving the registration in this area, the voluntary
services of a villager have been enlisted. He is not only expected to
assist the Chowkidar, who may be illiterate, by making entries in the
Chowkidar's register, but also to inform the registrar directly on all
births and deaths in the village. The registrar also obtains a list of
births, maternal and infant deaths as known to the Maternity and
Child Welfare Department, and by co-ordinating the information from
the three sources is expected to improve birth and death registration.
For all practical purposes the voluntary agency began operating only
from January 1946.

Method of enquiry. The enquiry in the Singur Centre covering 1945
and 1946 was started on the 17th February 1947. The field work lasted
for eleven weeks. In this enquiry an interviewer called on every
household to enumerate the resident population (separately as present and
absent) and visitors with particulars of community, age, sex, and
marital status, and to list all births and deaths which occurred in the
village during 1945 and 1946, listing separately with relevant
particulars those that occurred outside the Singur Health Centre. The lists so
prepared are the I-list which, as was mentioned earlier, were compared
with the registration books (the R-list). In the field-organization as
actually employed, there were four investigators who worked at the
comparisons and supervised the work of the 16 interviewers. The
interviewers and the investigators were selected from the village
population as it was thought that they would be able to obtain better
co-operation than an outsider.

It should be emphasized that the comparison of the two lists is crucial.
The establishment of the identity of two entries, one on one list and
one on the other, sometimes requires extreme perseverance. In some
cases the registrar's entry is by hearsay, and part of it may be wrong,
and often much consultation is required. The interviewer's entry,
however, is fortunately accompanied by a house-number or other means of
identification by which the information may be verified if necessary.

Basic data obtained from the enquiry. Table I shows the results of the
investigators' comparisons of the R and I-lists. As mentioned earlier,
there are some problems arising from illegible and incomplete entries,
the movements of the population and institutional births. The table
gives some idea of the magnitude of these problems. For example the
non-verifiable entries on the registrars' lists run to roughly 10% or more
of their total entries. In view of their magnitude the assumption that
the unverifiable entries are a representative sample of all entries, an
assumption that will be made in the calculations, becomes all the more
important. The need of more careful registration in the future is
apparent.

No separate account was maintained of the number of correctly
registered events occurring in families that had migrated out of the
village prior to the interviewers' survey. The assumption will be made
that the registrars would have recorded this category to the same
degree as for the non-migrants, but the number is small and under the
conditions of the Indian village, this assumption is not important.

In this enquiry the non-resident institutional births and deaths are 
considered separately and excluded from the table, as indicated. Insti¬ 
tutional facilities exist only in the Singur Union Board. The number 
of the institutional births to non-residents was about 8% in 1945 and 
1946. The number of institutional deaths of non-residents was only 
about 3%. 

Estimation of total births and deaths. In order to investigate the
homogeneity of smaller groups comprising the whole, so as to arrive at
the best estimate of the total number of events, calculations were
carried out:

i. for the Centre as a whole (births and deaths)

ii. for each Union Board separately; then these figures were
combined (births and deaths)

iii. for males and females separately for the Centre as a whole;
then these figures were combined (deaths only)

iv. for age groups by sex for the Centre as a whole; then the
figures were combined (deaths only)

In 1945 the total number of deaths as estimated by these four methods
was 2234, 2238, 2245, and 2418 respectively, each with a standard
error of approximately 70. In 1946 the number of deaths as estimated
by the four methods was 1,696, 1,684, 1,698 and 1,765, each with a
standard error of approximately 40. The closeness of the first three estimates
indicates that the chances of the registrar and the interviewer detecting
an event did not vary to any marked extent between Union Boards
and the sexes. The increase obtained by the fourth method clearly
indicates that the chances of the interviewer and the registrar detecting a
death may differ considerably with the age of the dead person. Positive
correlation is indicated.

Higher percentages of deaths in the younger age-groups were missed
by both R and I as compared with adult age groups. The proportion
missed also showed a tendency to increase in the more advanced
age-groups. It would be interesting to ascertain whether the estimate could
be increased still further by finer subdivision of age groups or by
subdivision in regard to other characteristics within each group, but no
further analyses were conducted.

As for births, the total number estimated from the data of the entire
Centre was 2908 for 1945 and 3744 for 1946. Separate estimation for
the four Union Boards when totalled yields 2915 and 3775 for the same
years. It is to be noted that while the latter figures are the higher of the
two, the figure for 1945 is higher by only 1/7th of the standard error
and the figure for 1946 is higher by a whole standard error.

The highest figure obtained by breaking the population into groups
in various ways, and adding the estimated number of events, is to be
accepted as nearest the true figure. The nonresident institutional
events, which were left out of consideration, may be added in to get the
total number of events occurring in the area.

Estimation of rates and incompleteness of registration. For computing
birth and death rates over an area, the population base is furnished by
the house-to-house canvass. The total number of correct entries in the
R-list judged against the total estimated number of events, measures
the extent of registration. Tables II and III show the results obtained
for rates and for completeness of registration.
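The arithmetic behind a rate and a completeness figure can be sketched as follows. The sketch assumes the two-list estimate C + N1 + N2 + N1·N2/C for the total number of events; the counts and the population in the usage lines are hypothetical and are not the Singur figures.

```python
def estimated_total(c, n1, n2):
    """Two-list estimate of the total number of events.

    c  : entries common to the R-list and the I-list
    n1 : correct entries found on the R-list only
    n2 : entries found on the I-list only
    """
    if c <= 0:
        raise ValueError("at least one matched entry is needed")
    return c + n1 + n2 + n1 * n2 / c


def extent_of_registration(c, n1, n2):
    """Correct R-list entries as a fraction of the estimated total."""
    return (c + n1) / estimated_total(c, n1, n2)


def rate_per_thousand(events, population):
    """Events per 1,000 of the population enumerated in the canvass."""
    return 1000.0 * events / population


if __name__ == "__main__":
    # Hypothetical counts for one area and one year.
    c, n1, n2 = 600, 300, 200
    total = estimated_total(c, n1, n2)
    print(total, extent_of_registration(c, n1, n2), rate_per_thousand(total, 40000))
```

Under this estimate the extent of registration reduces algebraically to C/(C + N2), the share of the interviewers' entries that were also registered.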


TABLE II

BIRTH AND DEATH RATES IN 1945 AND 1946. SINGUR HEALTH CENTRE

                                           1945                  1946
                                     Rate    Standard      Rate    Standard
                                              error                 error

Birth rate per 1,000 population      46.1      0.8         59.8      1.0
Death rate per 1,000 population      37.7      1.2         27.5      0.7
Specific death rate (males)          36.4      1.6         27.3      1.0
Specific death rate (females)        39.2      2.1         27.8      1.0

TABLE III

PERCENTAGE OF BIRTH AND DEATH REGISTRATION DURING 1945 AND 1946

                    Birth registration           Death registration
Union board         1945          1946           1945          1946

Singur            60.4-67.9     70.9-77.1      38.1-46.9     42.0-49.1
Balarambati       51.5-55.8     53.3-57.8      45.8-55.9     50.8-58.0
Bora              53.1-61.3     56.0-66.0      54.9-66.5     52.6-63.4
Begumpur          47.?-?0.3     61.3-64.7      42.6-46.4     44.9-48.1

Note (1) The range is due to non-verified entries on the R-list.

Note (2) The figures are subject to a standard error of about 3 per cent.





























One comment may be made in regard to the birth rate for 1946, which
appears to be very high. Possible explanations may be the improved
economic situation after the famine of 1943, and demobilization.
Another possible explanation is failure of the investigator to establish the
identity of entries in the R and I lists, but if this were so, it should be
more apparent for 1945, which it is not, as the birth rates for 1945 are
much lower. An improbable explanation is that each Union Board is
composed of extremely heterogeneous sections displaying negative
correlation between the probabilities of detection of events by the
Registrar and the interviewers.

Another comment should be made. The completeness of registration,
recorded in Table III, is based on the number of correct entries in the
R-list judged against the estimated total number of events. Official
published rates in all countries are based on the total number of
registrars' entries, correct plus incorrect, and the usual practice of inflating
official rates to correct for incompleteness of registration yields spurious
results: the rates are already partly inflated owing to incorrect entries.
Proper inflation (correction of rates) is possible only by comparing the
registration lists with the results of a population survey and making
estimates of the total number of events and the proportion of incorrect
entries in the registration lists.

The precision of estimated number of events. From the fact that the
coefficient of variation of a total estimated number of events is
$\sqrt{q_1 q_2/(N p_1 p_2)}$, it will be seen that the lower the efficiency of detection of
an event on either the I or R-lists ($p_2$ or $p_1$), the greater the standard
error of the total. In this enquiry, in spite of the fact that local people
were hired and trained especially for this work, the efficiency of the
interviewing was not of high order: only 67.2% of the births in 1946
and 52.8% in 1945 were detected by the interviewers. The corresponding
percentages for deaths were 50.7 and 32.3. Methods of improving
the performance of the interviewers must be sought, and it appears
that the interval of time to be covered by the survey must not extend
too far back.

It is highly important to bear in mind that regardless of the
interviewers' performance, the method proposed here for estimating the
total number of events, N, is not subject to bias,* but poor performance
does increase the error of the estimate of N. It also increases the
standard error of the estimate of the registrars' performance.

The coefficient of variation is also influenced by N. It is important
to note that N in the formula refers to any total, not just a total over
an area, but a total for any subgroup, such as an age or sex class for
which an estimate is prepared. For the area and sex classes that were
used here, the standard errors of the estimated totals varied from 1 to
10%. Over a larger area, or over broader classes, the coefficients of
variation would be reduced by the presence of the factor $\sqrt{N}$ in the
denominator.

* In making this statement the case of $p_2$ (or $p_1$) = 0 is considered trivial and is excluded.
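How the coefficient of variation responds to the two efficiencies and to N can be seen in a minimal sketch of the expression $\sqrt{q_1 q_2/(N p_1 p_2)}$. The function name and the numerical values in the usage lines are illustrative assumptions, not figures from the enquiry.

```python
import math


def coefficient_of_variation(n_events, p1, p2):
    """sqrt(q1 q2 / (N p1 p2)) for registrar efficiency p1 and interviewer efficiency p2."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    return math.sqrt(q1 * q2 / (n_events * p1 * p2))


if __name__ == "__main__":
    # Hypothetical total of 2,500 events with a registrar efficiency of 0.6.
    for p2 in (0.3, 0.5, 0.7):
        print(p2, round(coefficient_of_variation(2500, 0.6, p2), 4))
    # Quadrupling N halves the coefficient of variation.
    print(coefficient_of_variation(10000, 0.6, 0.5) / coefficient_of_variation(2500, 0.6, 0.5))
```

The loop shows the error shrinking as the interviewers' efficiency improves; the last line shows the effect of the factor $\sqrt{N}$.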

Costs. A few words regarding the cost of this particular enquiry may
be helpful in planning future enquiries. The cost of the field-work,
including salaries and overhead charges, amounted to Rs. 4,000. The
cost of tabulation and analysis amounted to Rs. 1,500. The total cost
was thus Rs. 5,500 or about 1⅜ annas (2 U. S. cents) per capita in the
area of enquiry. For various reasons (this being a pilot study and a
complete listing of the population being desirable for other reasons), the
entire population was covered without the introduction of sampling.
In designing an enquiry for a larger area such as a province or even a
district, sampling would be used.

For each area in the sample there can be calculated the total number
of events and the rate: also the efficiency $p_1$ of the registrar's
performance. For each sample-area, supposedly completely canvassed (no
sub-sampling) there will be an error in estimating either the rate or the
registrar's performance. The coefficient of variation in the rate will be
the expression already given earlier, viz. $\sqrt{q_1 q_2/(N p_1 p_2)}$. Likewise, the
coefficient of variation of the estimate of $p_1$, the registrar's performance,
is

$$\sqrt{\frac{q_1}{(C+N_2)\,p_1}\cdot\frac{N-C-N_2}{N-1}}$$

Each symbol refers to the particular area covered. These errors are not
erased by taking a complete canvass. (As a matter of fact, the particular
enquiry described here was a complete canvass, yet subject to these
errors.)

When sampling is introduced to study a whole District, the estimation
of the total number of events, the rates, and the over-all efficiency
of registration will be made by combining the data from a number of
sample-areas. An additional error is then introduced for a District as a
whole because of variability between the sample areas. The variability
between the rates of the individual sample-areas may be much smaller
than the variability between their total events, as it is usually difficult
to define sample-areas of equal populations. It follows that usually a
much smaller sample will provide a standard error of (e.g.) 4 per cent
in an over-all rate for a District than is required to provide the same
precision in the total number of events.

The cost of attaining (e.g.) a 4% error of sampling will depend on the
particular design of sample that is used; and the design in turn will, for
greatest economy, depend on the density and distribution of the
population, on the variability of the birth and death rates over the area for
which estimates are to be prepared, on the costs of purchasing or
preparing maps and lists by which the sampling procedure may be
formulated, on the quality of personnel available to carry out the work,
etc.

As a general principle, applicable to large populations, so far as the
errors of sampling are concerned, the total number of cases (i.e., the
total number of people, households, areas, or whatever unit constitutes
the elements of sampling) to be included in the survey depends almost
entirely on the precision of sampling that is desired in the estimation of
the total number of events, or in the rate (whichever is the aim of the
survey) and hardly at all on the total number of inhabitants in the area to
be covered.*

In India, the birth and death rates should be estimated at least by
the District (roughly 1 to 2 million inhabitants), and for smaller areas
if funds would permit. Roughly speaking, to attain an over-all
standard error of 5% (a reasonable aim for the present), the cost of a
survey will run between Rs. 10,000 and 15,000 for a district.

Additional information provided. A survey of this type also provides*
valuable ancillary information regarding other characteristics of the
population such as size of family, age and sex distribution, marital
status, occupation and industry, specific fertility rates, gross and net
reproduction rates, and other information, but the list cannot be
extended indefinitely because the interest of the field workers must not be
dissipated too far from the main aims of the survey.

* It is presumed in this statement that the physical facilities for sampling (maps, lists, personnel,
payment, etc.) are about the same over all parts of the area to be covered.

* As a matter of fact, the surveys reported have provided most of these additional items, and the
cost mentioned includes them.

APPENDIX

THE STANDARD ERROR OF $\hat N$

An approximate value for the standard error of $\hat N$,

$$\hat N = \frac{(C+N_1)(C+N_2)}{C},$$

can be obtained by the application of the formula that the variance
$Vf(x)$ of a function $f(x)$ of x is approximately given by

$$\left(\frac{df}{dx}\right)_e^{2} V(x)$$

where $(\ )_e$ denotes the substitution of the expected values for x
appearing inside the bracket after differentiation, and $V(x)$ denotes the
variance of x.

If $C+N_1$, $C+N_2$ and N are fixed, it is known that the expected
value $E(C)$ and the variance $V(C)$ of C are given respectively by

$$E(C) = N p_1 p_2$$

and

$$V(C) \doteq N p_1 q_1 p_2 q_2$$

where

$$p_1 = \frac{C+N_1}{N}, \qquad p_2 = \frac{C+N_2}{N}.$$

Under the same conditions, the variance $V(\hat N)$ of $\hat N$ is

$$V(\hat N) = (C+N_1)^2 (C+N_2)^2\, V\!\left(\frac{1}{C}\right)$$

which by the application of the formula given above reduces to

$$\frac{N q_1 q_2}{p_1 p_2}.$$

The standard error of $\hat N$ is therefore

$$\sqrt{\frac{N q_1 q_2}{p_1 p_2}}$$

approximately.
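The approximation can be checked by a small simulation, sketched here under the appendix's own condition that the two list totals and N are held fixed, so that only the number of matches C varies. The population size, list totals, number of replications and seed are hypothetical choices made for illustration.

```python
import math
import random


def formula_se(n_events, r_total, i_total):
    """sqrt(N q1 q2 / (p1 p2)) with p1 = R-list total / N and p2 = I-list total / N."""
    p1, p2 = r_total / n_events, i_total / n_events
    return math.sqrt(n_events * (1.0 - p1) * (1.0 - p2) / (p1 * p2))


def simulated_se(n_events, r_total, i_total, reps=2000, seed=0):
    """Empirical standard deviation of (C+N1)(C+N2)/C over repeated random I-lists."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        # Events 0 .. r_total-1 are on the R-list; the I-list is a random draw.
        drawn = rng.sample(range(n_events), i_total)
        matches = sum(1 for event in drawn if event < r_total)
        estimates.append(r_total * i_total / matches)
    mean = sum(estimates) / reps
    variance = sum((e - mean) ** 2 for e in estimates) / (reps - 1)
    return math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical case: 2,000 events, 1,200 registered, 1,000 found by interview.
    print(formula_se(2000, 1200, 1000), simulated_se(2000, 1200, 1000))
```

The two printed values should be close; the simulated figure carries its own sampling error, so exact agreement is not to be expected.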



EVALUATION OF PARAMETERS IN THE
GOMPERTZ AND MAKEHAM EQUATIONS

J. F. Brennan

This paper describes a technique for determining the mortality
characteristics of physical property through the use of accounting
records of plant balances and yearly additions.
Where time is not available for extensive actuarial research,
the method produces results within tolerable limits of accuracy.
Its limitations are pointed out.

The assembly of the statistics needed for an actuarial analysis of
physical plant is a tedious and expensive task. Basic records are not
always adequate for this purpose, a circumstance which poses a
formidable research problem. The technique described herein was devised
in an effort to bypass this obstacle. The application of the method
requires only a money record of plant balances and of gross additions over
a period of years.

Evidently the plant balance at any time is the summation of the
survivors from the gross additions of all previous years. If, using this
principle, we attempt a direct determination of the parameters for a
Gompertz or Makeham equation, we immediately encounter the
difficulty of making summations, because of the compound exponential
property of these functions. We may, however, expand these functions
into convergent power series and by disregarding terms beyond the
second or third power, often find an acceptable solution. In the
development of the theory in the following paragraphs, I take the case of
the Gompertz equation and its power series, limited to the square term.
The extension of the technique to the Makeham equation and to higher
terms of the series expansion will be obvious.

The specific problem is to determine, on the basis of a set of data,
the parameters of a Gompertz equation:

$$y = k\,g^{c^{t}}$$

where t = age and y = survivors at age t.

The data are given in the form:

$$P_1 = Y_{11}$$
$$P_2 = Y_{12} + Y_{21}$$
$$P_3 = Y_{13} + Y_{22} + Y_{31}$$
$$P_4 = Y_{14} + Y_{23} + Y_{32} + Y_{41}$$

where $P_4$ is the plant balance at the end of year four and $Y_{23}$ is the
amount of plant remaining from the gross addition of year two, 3 years
after installation, etc.

For simplicity in the development, there will be imposed the condition
that at age zero, the survivor curve passes through point (0, 1),
so that we show at each age the fraction surviving. Thus we write:

$$y = \frac{1}{g}\,g^{c^{t}}.$$

Expanding y in a Maclaurin series gives:

$$y = 1 + GCt + \frac{1}{2!}\,GC^2(G+1)\,t^2 + \frac{1}{3!}\,GC^3(G^2+3G+1)\,t^3 + \cdots$$

where $G = \log_e g$ and $C = \log_e c$.

Now if, as found in some applications, the values of G and C are such
that over the range $Y_{11}$ to $Y_{1n}$ the sum of the cubic and higher order
terms is small compared to the sum of the first and second order terms,
then for that range the Gompertz equation may be approximated by

$$y = 1 - at - bt^2.$$

If coefficients a and b are determined from the data, the parameters
of the Gompertz Equation are given by:

(1)    $$G = -\frac{a^2}{a^2+2b}, \qquad C = -\frac{a}{G}.$$
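The relation between the parabola and the Gompertz curve can be sketched in a few lines. The sketch takes the survivor curve as $y = g^{c^t - 1}$ (that is, $(1/g)\,g^{c^t}$), reads a and b off the first two terms of the series, and inverts them with equations (1). The function names and the values of g and c in the usage lines are hypothetical.

```python
import math


def gompertz_survivors(t, g, c):
    """Fraction surviving at age t for the curve y = (1/g) * g**(c**t)."""
    return g ** (c ** t - 1.0)


def parabola_from_gompertz(g, c):
    """Coefficients a, b of y = 1 - a t - b t^2 from the series expansion."""
    G, C = math.log(g), math.log(c)
    a = -G * C
    b = -0.5 * G * C * C * (G + 1.0)
    return a, b


def gompertz_from_parabola(a, b):
    """Equations (1): G = -a^2 / (a^2 + 2b), C = -a / G; returns (g, c)."""
    G = -a * a / (a * a + 2.0 * b)
    C = -a / G
    return math.exp(G), math.exp(C)


if __name__ == "__main__":
    a, b = parabola_from_gompertz(0.95, 1.10)   # hypothetical g and c
    print(a, b, gompertz_from_parabola(a, b))
    for t in (1, 5, 10):
        print(t, gompertz_survivors(t, 0.95, 1.10), 1.0 - a * t - b * t * t)
```

The loop prints the exact curve beside the parabola, which makes visible how the neglected cubic and higher terms grow with age.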


To determine the values of a and b we proceed as follows: Let

(2)    $$Y_{x,\,n-x} = 1.00 - a(n-x) - b(n-x)^2$$

where $Y_{x,\,n-x}$ = the survivors in year n, as a decimal part of the
                 installations made in year x (i.e. (n−x) years after
                 installation),
      x = year of installation,
      n = a particular year, subsequent to year x,
      (n−x) = age at year n,
      a & b are constants to be determined.

If we sum up the survivors in year n out of the installations of all
previous years, we have

(3)    $$P_{n+1} = \sum A_x\left[1 - a(n-x) - b(n-x)^2\right]$$

where $A_x$ = gross additions made in year x,
      $P_{n+1}$ = capital in service (i.e. plant balance) at the end of the
              nth year (beginning of year n+1).

Expanding (3) we obtain

$$P_{n+1} = \sum A_x - a\left[n\sum A_x - \sum A_x x\right] - b\left[n^2\sum A_x - 2n\sum A_x x + \sum A_x x^2\right]$$

or

(4)    $$\sum A_x - P_{n+1} = a\left[n\sum A_x - \sum A_x x\right] + b\left[n^2\sum A_x - 2n\sum A_x x + \sum A_x x^2\right]$$

Now evidently values may be fixed for a and b by choosing any two
different years n, and solving simultaneously the two resulting
equations (4). Thus many different sets of values for a and b can be obtained
by varying the selection of the two n's. A least squares solution based
on a range of values of n is indicated.

For convenience let

$$U_n = \left[n\sum A_x - \sum A_x x\right],$$
$$V_n = \left[n^2\sum A_x - 2n\sum A_x x + \sum A_x x^2\right],$$
$$Z_n = \sum A_x - P_{n+1}.$$

Then Equation (4) may be written

(5)    $$Z_n = aU_n + bV_n.$$

From a set of equations (5) the best values of a and b ("best" in the
least squares sense) are found from the normal equations:

(6)    $$a\sum U^2 + b\sum UV = \sum UZ,$$
       $$a\sum UV + b\sum V^2 = \sum VZ,$$

where the summations are carried out over such a range of values of
n as is appropriate to the approximation.
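The least squares step can be sketched as follows. The sketch forms U, V and Z for each year n from a record of gross additions and end-of-year balances, then solves the two normal equations directly. The function names and the synthetic record in the usage lines are hypothetical; the record is generated from a known parabola so that the recovered a and b can be compared with the values put in.

```python
def fit_parabola(additions, balances):
    """Least squares values of a and b in Z_n = a U_n + b V_n.

    additions : dict mapping year x to gross additions A_x
    balances  : dict mapping year n to the plant balance at the end of year n
    """
    suu = suv = svv = suz = svz = 0.0
    for n, balance in balances.items():
        years = [x for x in additions if x <= n]
        s0 = sum(additions[x] for x in years)            # sum of A_x
        s1 = sum(additions[x] * x for x in years)        # sum of A_x * x
        s2 = sum(additions[x] * x * x for x in years)    # sum of A_x * x^2
        u = n * s0 - s1
        v = n * n * s0 - 2 * n * s1 + s2
        z = s0 - balance
        suu += u * u
        suv += u * v
        svv += v * v
        suz += u * z
        svz += v * z
    det = suu * svv - suv * suv
    if det == 0:
        raise ValueError("normal equations are singular; use more years")
    a = (suz * svv - suv * svz) / det
    b = (suu * svz - suv * suz) / det
    return a, b


def synthetic_balances(additions, a, b):
    """Balances implied by the survivor parabola 1 - a t - b t^2 (for testing only)."""
    return {
        n: sum(A * (1.0 - a * (n - x) - b * (n - x) ** 2)
               for x, A in additions.items() if x <= n)
        for n in additions
    }


if __name__ == "__main__":
    additions = {x: 100.0 + 10.0 * x for x in range(1, 11)}   # hypothetical record
    balances = synthetic_balances(additions, a=0.02, b=0.001)
    print(fit_parabola(additions, balances))
```

With an actual book record the fitted parabola will not reproduce the balances exactly, and the choice of the range of years n matters, as the text notes.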

With the numerical values of a and b obtained by solving equations
(6) the survivor equation (2) may be conveniently written

(7)    $$Y_t = 1 - at - bt^2$$

where t = age = (n−x), $Y_t$ = survivors at t as a decimal part of the
additions made in year 0 ($Y_0 = 1.0$).

The values of G and C (and from them g and c) may now be obtained
by substituting a and b in equations (1). The resulting Gompertz
curve and the parabola, equation (7), should then be plotted on the
same graph. It may be found that the two curves exhibit important
divergence in the higher years of the range employed in the solution.
This indicates that the third order and higher terms of the power series
should not have been neglected in Equation (2). To include higher
order terms, however, would multiply the work of the solution
tremendously, as will be obvious from the summations of equation (4).

A satisfactory way out of this difficulty consists in calculating a
number of points on the curve, Equation (7), and to them fitting a
Gompertz curve by King's method.* The range of points employed in
this procedure should be approximately the same as that used in
deriving constants a and b.

Chart I illustrates the processes heretofore discussed. The two
curves shown thereon are calculated from the same actual book record.
First the constants in the parabola were fixed by the method proposed
herein. Two Gompertz equations were then found: one by substituting
the parabolic constants in Equations (1); and the other by the
logarithmic method of King. The latter method appears to give the better
interpretation of the data, since, as seen on Chart I, it produces a curve
coincident with the parabola over the range of the data employed.

How well the Gompertz equation, found by King's logarithmic
method, defines the mortality characteristics of plant, may be judged
from the results given in Table I, which shows how nearly the
theoretical capital in service, calculated from the derived Gompertz
equation, matches up with the actual book record. The example used is
based on the same actual data as underlies Chart I and represents fixed
capital investment in overhead electric conductors, to which yearly
additions were erratic, i.e., not correlated with the amount of capital in
service. The test is, therefore, especially significant. The standard error
of estimate calculated from Table I is 7.7 or about ¼ of 1% of capital in
service during the last 5 years of the table.

A further test of this technique was made on a group composed of
thousands of homogeneous units of equipment (gas meters). Having
previously, by the conventional actuarial process, derived a Gompertz
survivor curve for this group, a sample accounting record of gross
additions was assumed and a plant balance record calculated from the
known Gompertz curve, covering a period of 30 years. The process
described herein was then applied to the sample, King's method being
used for the extrapolations. The results are depicted on Chart II.


* See "Textbook of the Institute of Actuaries" or Winfrey, "Statistical Analyses of Industrial
Property Retirements," Bulletin 125, Iowa Engineering Experiment Station, Ames, Iowa.




[CHART I. Survivors, % of radix, plotted against age in years.]

[CHART II. Survivors plotted against age in years. Curves: original Gompertz curve (average life 30.4 years); Gompertz curve derived from assumed sample plant.]








TABLE I

TEST OF DERIVED SURVIVOR CURVE

          Capital in service at beginning of year
                     (000 omitted)                    Deviation
Year          Actual          Theoretical          (absolute value)

1935          $2,817            $2,840                   23
1936           2,862             2,856                    6
1937           2,916             2,910                    6
1938           2,972             2,963                    9
1939           3,097             3,098                    1
1940           3,161             3,172                   11
1941           3,223             3,229                    6
1942           3,320             3,339                   19
1943           3,391             3,390                    1
1944           3,457             3,443                   14


The true average life is 30.4 years, while that derived by the proposed 
method is 30.2 years, giving an error of approximately six tenths of one 
per cent. This is a tolerable deviation in this type of problem. No doubt 
greater precision could have been achieved by the use of a cubic (or 
higher degree) equation in lieu of the second degree parabola, but the 
extra labor is not justified in this case. Some precision was sacrificed
in rounding out the data to units of one thousand dollars. Greater refinement
is probably not warranted in the use of this process.
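
The arithmetic quoted above can be checked in a few lines. The sketch below is illustrative only: the function and variable names are not from the paper, and it recomputes just the absolute deviations of Table 1 and the relative error of the derived average life (30.2 against 30.4 years). It does not attempt to reproduce the standard error of estimate, whose exact definition is not stated in this passage.

```python
# Rows of Table 1: (year, actual, theoretical), capital in thousands of dollars.
TABLE_1 = [
    (1935, 2817, 2840),
    (1936, 2862, 2856),
    (1937, 2916, 2910),
    (1938, 2972, 2963),
    (1939, 3097, 3098),
    (1940, 3161, 3172),
    (1941, 3223, 3229),
    (1942, 3320, 3339),
    (1943, 3391, 3390),
    (1944, 3457, 3443),
]


def absolute_deviations(rows):
    """Absolute difference between the actual and theoretical columns."""
    return [abs(actual - theoretical) for _, actual, theoretical in rows]


def relative_error_percent(true_value, derived_value):
    """Error of the derived value as a percentage of the true value."""
    return 100.0 * abs(true_value - derived_value) / true_value


if __name__ == "__main__":
    print(absolute_deviations(TABLE_1))
    print(round(relative_error_percent(30.4, 30.2), 2), "per cent")
```

The relative error comes out a little above six tenths of one per cent, in line with the figure quoted in the text.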

In applying this method, one may be confronted with a reliable sta¬ 
tistical record covering a period of years, but may lack knowledge of 
the age distribution of the plant balance at the beginning of that period. 
Frequently in such cases a satisfactory solution can be made by esti¬ 
mating the average age of the beginning plant balance and treating it 
as a gross addition in the year corresponding to that age. Such a scheme 
has obvious frailty and should be employed only when no more logical
process is possible.

It is conceivable that the derived parabola may have a maximum at
an age significantly greater than zero. In some applications also, it
may be found that the parabola is concave upward. Such results indicate
a lack of stability in mortality ratios in the period (band) of years
employed. This condition is entrained by shifting retirement policies,
changed economic conditions, lack of replacement material, or replacements
and additions made by substitution of materials or equipment
having inherently different life characteristics. Sometimes these faults
will vanish with the selection of a different band of years (range n)
for solving Equations (6). If such conditions persist, however, the
method breaks down.













ON THE “INFORMATION” LOST BY USING A t-TEST WHEN
THE POPULATION VARIANCE IS KNOWN

John E. Walsh 
The RAND Corporation 

This note calls attention to the use of the power function 
as a means of determining how much “information” is lost
by using some other test in place of the most powerful test 
of a given hypothesis. As an example of the method, the case 
of using a t-test for the mean of a normal population with 
known variance is analyzed. 

INTRODUCTION 

If two significance tests of the same hypothesis should happen to
have the same power function, these tests would furnish the same
amount of “information” about the hypothesis tested in the sense of
the Neyman-Pearson theory of testing hypotheses. Of course, it is
hardly to be expected that two different significance tests will have
exactly the same power function. In some cases, however, two significance
tests may have very nearly the same power function. Then the
two tests are said to furnish approximately the same amount of
“information” concerning the hypothesis tested.

If there exists a most powerful test for a given hypothesis, “information”
(in the sense of the Neyman-Pearson theory of testing hypotheses)
will be lost by using some other test rather than this most powerful
test. (For fixed sample size and significance level, a test is most powerful
if the values of its power function are greater than or equal to those
of the power function of any other test of the same hypothesis for the
particular alternative considered.) It may happen, however, that the
most powerful test (at same significance level) has approximately the
same power function as the given test if the most powerful test is based
on a smaller sample size; i.e., a most powerful test based on m sample
values furnishes approximately the same amount of “information” as
the given test using n sample values ($m \le n$). Then it will be said that
$n - m$ sample values are “wasted” or “lost” by using the given test rather
than the most powerful test. By convention, the value of m is allowed
to assume non-integral values; the values of the power function of the
most powerful test for non-integral m are found by interpolation from
the power function values for integral m. This procedure furnishes an
interpolated measure of the number of sample values “lost.”

The above procedure could also be carried out in terms of operating
characteristic functions rather than power functions. Since

$$(\text{Power Function}) = 1 - (\text{OC Function}), \tag{1}$$

however, the same value of m is obtained.

The value of 100m/n% is called the power efficiency of the given significance
test. A discussion of power efficiency which contains an exact
definition of when two power functions are to be considered equivalent
(in the sense of furnishing the same amount of “information”) is given
in [1]. From (1), the definitions and remarks of [1] are equally applicable
to the case in which OC functions are used instead of power functions.

As an example of application of the above method, let us consider a
sample from a normal population with unknown mean and known
variance. If it is desired to test the population mean with respect to a
given constant value, the most powerful one-sided and symmetrical
tests are based on the quantity

$$\frac{(\text{sample mean}) - (\text{given constant value})}{(\text{population standard deviation})}. \tag{2}$$

Thus, if the Student t-test is used instead of (2), “information” will be
lost. This note presents an approximate expression for the number of
sample values “lost” for the cases of one-sided and symmetrical t-tests.

The example analyzed has statistical interest in itself. Many statis¬ 
ticians have probably wondered how much information is lost when 
this situation occurs. One possible application of the result would be 
to help in deciding whether to use the t-test or test (2) with the popu¬ 
lation standard deviation estimated from past information. The final 
decision, of course, would also depend on the reliability of the estimate 
of the population standard deviation, the cost of taking observations, 
and perhaps other considerations. The complete formulation and analy¬ 
sis of this situation, however, is not considered to be a problem of this 
note. 

Situations similar to the example analyzed here were investigated
by Neyman in [2] and by Fisher in [3]. Fisher’s results, however, are
based on estimation rather than power function considerations.

2. Results for example: Let n sample values $x_1, \ldots, x_n$ be drawn
from a normal population with unknown mean $\mu$ and known variance
$\sigma^2$. Let us consider tests of whether $\mu$ differs from a given constant value
$\mu_0$ which are based on the t-statistic

$$t = \frac{(\bar{x} - \mu_0)\sqrt{n(n-1)}}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}}.$$

All the t-tests investigated in the note are of this type.
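
As an illustration, the t-statistic above can be computed directly from a sample. The sketch below is a minimal one using only the Python standard library; the function names and the sample values are hypothetical. It also evaluates the more familiar form, the sample mean minus the constant, divided by s/sqrt(n) with s the usual sample standard deviation, to show that the two expressions agree.

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(xs, mu0):
    """t in the form used in the note:
    (xbar - mu0) * sqrt(n (n - 1)) / sqrt(sum of squared deviations).
    Raises ZeroDivisionError if all sample values are equal."""
    n = len(xs)
    if n < 2:
        raise ValueError("at least two sample values are required")
    xbar = mean(xs)
    sum_sq = sum((x - xbar) ** 2 for x in xs)
    return (xbar - mu0) * sqrt(n * (n - 1)) / sqrt(sum_sq)


def t_statistic_usual_form(xs, mu0):
    """The same statistic written as (xbar - mu0) / (s / sqrt(n))."""
    n = len(xs)
    return (mean(xs) - mu0) / (stdev(xs) / sqrt(n))


if __name__ == "__main__":
    sample = [4.1, 5.3, 4.8, 5.9, 5.0]  # hypothetical data
    print(t_statistic(sample, 5.0), t_statistic_usual_form(sample, 5.0))
```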




For one-sided t-tests at significance level $\alpha$, or a symmetrical t-test
at significance level $2\alpha$, it is found that approximately

$$\frac{1}{2} K_\alpha^2 \, \frac{n}{n-1}$$

sample values are “wasted” by using a t-test rather than the corresponding
test of type (2). Here $K_\alpha$ is the standardized normal deviate
exceeded with probability $\alpha$; i.e., the function $K_\alpha$ is defined by

$$\frac{1}{\sqrt{2\pi}} \int_{K_\alpha}^{\infty} e^{-x^2/2} \, dx = \alpha. \tag{3}$$

The above approximation to the number of sample values “lost” is
reasonably accurate for $n \ge 4$ if $\alpha = 5$ per cent, $n \ge 5$ if $\alpha = 2.5$ per cent,
$n \ge 6$ if $\alpha = 1$ per cent, $n \ge 7$ if $\alpha = 0.5$ per cent. The accuracy of the
approximation increases as n increases.

If n is not too small, the above results can be roughly summarized
by stating that $\frac{1}{2}K_\alpha^2$ sample values are “lost” by using a one-sided t-test
at significance level $\alpha$ or a symmetrical t-test at significance level
$2\alpha$. Table 1 contains values of $\frac{1}{2}K_\alpha^2$ for $\alpha = 5$ per cent, 2.5 per cent, 1
per cent, 0.5 per cent.

TABLE 1

APPROX. NO. OF SAMPLE VALUES “WASTED” USING
t-TEST WHEN VARIANCE KNOWN

         Significance Level                 Approximate No. of
One-sided t-test    Symmetrical t-test    Sample Values “Wasted”

      5%                  10%                     1.4
     2.5%                  5%                     2.0
      1%                   2%                     2.7
     0.5%                  1%                     3.3
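
The entries of the table can be recomputed from the definition of the normal deviate. The sketch below is a minimal one, assuming only the Python standard library (`statistics.NormalDist`); the function names are illustrative and not taken from the note.

```python
from statistics import NormalDist


def k_alpha(alpha):
    """Standardized normal deviate exceeded with probability alpha."""
    return NormalDist().inv_cdf(1.0 - alpha)


def approx_values_wasted(alpha):
    """Large-n summary: half the square of the normal deviate."""
    return 0.5 * k_alpha(alpha) ** 2


if __name__ == "__main__":
    for alpha in (0.05, 0.025, 0.01, 0.005):
        print(
            f"one-sided {alpha:.1%}, symmetrical {2 * alpha:.1%}: "
            f"{approx_values_wasted(alpha):.2f}"
        )
```

The computed values agree with the printed table to one decimal place, except that the 2.5 per cent row comes out at about 1.9 rather than the printed 2.0.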


3. Derivations: Let us consider the one-sided t-test of $\mu < \mu_0$ at significance
level $\alpha$ and based on a sample of size n. Using a modification
of the normal approximation given in [4], it is found that the power
function values $\epsilon$ of the t-test are approximately determined by the
relation

$$K_\epsilon = K_\alpha - \frac{(\mu_0 - \mu)}{\sigma/\sqrt{n}} \left[ 1 - \frac{K_\alpha^2}{2(n-1)} \right]^{1/2},$$

where the K function is defined by (3). This approximation to the
power function is reasonably accurate for $n \ge 4$ if $\alpha = 5$ per cent, $n \ge 5$
if $\alpha = 2.5$ per cent, $n \ge 6$ if $\alpha = 1$ per cent, $n \ge 7$ if $\alpha = 0.5$ per cent. The
accuracy of the approximation increases with n.

Now consider the one-sided type (2) test of $\mu < \mu_0$ at significance level
$\alpha$ and based on a sample of size m. The power function values $\epsilon'$ of this
test are exactly determined by the relation

$$K_{\epsilon'} = K_\alpha - \frac{(\mu_0 - \mu)}{\sigma/\sqrt{m}}.$$
Hence the two one-sided tests will have approximately the same
power function if m is chosen so that

$$\frac{(\mu_0 - \mu)}{\sigma/\sqrt{m}} = \frac{(\mu_0 - \mu)}{\sigma/\sqrt{n}} \left[ 1 - \frac{K_\alpha^2}{2(n-1)} \right]^{1/2},$$

i.e., so that

$$n - m = \frac{1}{2} K_\alpha^2 \, \frac{n}{n-1}.$$

Thus approximately $\frac{1}{2}K_\alpha^2 \, n/(n-1)$ sample values are “wasted” if
the one-sided t-test of $\mu < \mu_0$ at significance level $\alpha$ and based on n
sample values is used rather than the corresponding type (2) test.

By symmetry, approximately $\frac{1}{2}K_\alpha^2 \, n/(n-1)$ sample values are also
“lost” by using the one-sided t-test of $\mu > \mu_0$ at significance level $\alpha$ and
based on a sample of size n.

Now the power function of the symmetrical t-test of $\mu \ne \mu_0$ at significance
level $2\alpha$ and based on n sample values equals the sum of the
power function of the one-sided t-test of $\mu < \mu_0$ with significance level
$\alpha$ and sample size n plus the power function of the one-sided t-test of
$\mu > \mu_0$ at significance level $\alpha$ and sample size n. Likewise the power
function of the symmetrical type (2) test of $\mu \ne \mu_0$ at significance level
$2\alpha$ and based on a sample of size m equals the sum of the power functions
of the two one-sided type (2) tests (of $\mu < \mu_0$ and $\mu > \mu_0$) at significance
level $\alpha$ and sample size m. Thus approximately $\frac{1}{2}K_\alpha^2 \, n/(n-1)$
sample values are “wasted” by using a symmetrical t-test of $\mu \ne \mu_0$ at
significance level $2\alpha$ and based on a sample of size n.
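
The matching argument of this section can be traced numerically. The sketch below is illustrative and rests on stated assumptions: it takes the approximate power relation for the one-sided t-test and the exact relation for the type (2) test as written above, measures the alternative by a hypothetical parameter `delta`, equal to the constant minus the true mean in units of the population standard deviation, and uses function names that do not come from the note. It shows that with m chosen as n less the number of values “wasted,” the two power functions coincide, and it reports the resulting power efficiency 100m/n.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def k(alpha):
    """Standardized normal deviate exceeded with probability alpha, as in (3)."""
    return _STD_NORMAL.inv_cdf(1.0 - alpha)


def values_wasted(alpha, n):
    """Approximate n - m: half the squared deviate times n / (n - 1)."""
    return 0.5 * k(alpha) ** 2 * n / (n - 1)


def power_t_approx(alpha, n, delta):
    """Approximate power of the one-sided t-test.
    delta = (mu0 - mu) / sigma is an illustrative parameterization."""
    ka = k(alpha)
    k_eps = ka - delta * sqrt(n) * sqrt(1.0 - ka ** 2 / (2.0 * (n - 1)))
    return 1.0 - _STD_NORMAL.cdf(k_eps)


def power_type2(alpha, m, delta):
    """Power of the one-sided type (2) test; m may be non-integral."""
    k_eps = k(alpha) - delta * sqrt(m)
    return 1.0 - _STD_NORMAL.cdf(k_eps)


def power_efficiency(alpha, n):
    """Power efficiency 100 m / n of the t-test, in per cent."""
    m = n - values_wasted(alpha, n)
    return 100.0 * m / n


if __name__ == "__main__":
    alpha, n = 0.05, 10
    m = n - values_wasted(alpha, n)
    for delta in (0.2, 0.5, 1.0):
        print(delta, power_t_approx(alpha, n, delta), power_type2(alpha, m, delta))
    print("power efficiency (per cent):", power_efficiency(alpha, n))
```

The agreement here is algebraic, since m is defined from the approximation itself; the sketch says nothing about how closely the approximation tracks the true power of the t-test.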


REFERENCES

[1] John E. Walsh, “Some significance tests for the median which are valid under
very general conditions.” Submitted for publication in Annals of Math. Stat.
Abstracted in Annals of Math. Stat., Vol. 18 (1947), pp. 610-611.

[2] J. Neyman, “Statistical Problems in Agricultural Experimentation,” Jour.
Roy. Stat. Soc., Vol. 2 (1935).

[3] R. A. Fisher, Design of Experiments, Oliver and Boyd, 1942, p. 231.

[4] N. L. Johnson and B. L. Welch, “Applications of the non-central t-distribution,”
Biometrika, Vol. 31 (1940), p. 376.



WESLEY CLAIR MITCHELL, 1874-1948 
AN APPRECIATION 

The death of Wesley Clair Mitchell brought to an end a lifetime
of formative research, inspired scholarship, and earnest, continuous
effort to apply scientific methods to social and economic problems.
The end was untimely not merely because it always comes too
soon for those few useful and lovable members of mankind of whom
Dr. Mitchell was such an outstanding example. It was even more untimely
because to the very end Dr. Mitchell retained the keenness of
mind, the breadth of vision, the hospitality to new and pseudo-new
ideas, and the kindliness to their often overconfident bearers--rare
qualities even among scholars. To those of us who knew him and had
the privilege of meeting him often, he seemed ageless and timeless. It
is still difficult to realize that he is gone and will not be here to listen
to our enthusiasms and complaints, to comment wisely and always with
a charming humor upon some new quirk of the human mind, and to set
before impatient younger generations further examples of broad
scholarship and of respect for data and problems.

In these few notes it is perhaps most appropriate to stress Dr.
Mitchell’s work in statistics. Of the many who are familiar with his
writings in recent decades only a few may realize how consistent was
his interest and how continuous his research in the field of statistics. As
a young graduate student at the University of Chicago at the end of the
1890’s, his interest aroused by the monetary questions of the day, Dr.
Mitchell was already contributing to the enrichment of quantitative
knowledge by a series of articles on prices and by his work on the inflation
experience during the Civil War--work that eventually resulted in
two monumental treatises (published in 1903 and 1908). Upon completion
of his graduate training at Chicago (with one year in Germany and
Austria), Dr. Mitchell spent 1899-1900 at the Bureau of the Census,
when Allyn Young and Walter F. Wilcox were there. This early combination
of fruitful use of data in the study of economic problems with
active interest in public agencies responsible for social and economic
statistics set a precedent consistently followed throughout his lifetime.
As work on currency and monetary problems gradually gave way to
the broader studies on business cycles, Dr. Mitchell continued to maintain
his active interest in and scrutiny of the basic data. There followed
articles on the BLS index numbers of wages (QJE, 1911), on new banking
measures (JPE, 1914), and on possible improvements in the statistical
output of federal bureaus (Quart. Pub., ASA, 1915). From his
earliest productive years to the publication of that classical treatise on
Business Cycles (1913) there was a continuous interplay of analysis, attempts
at gathering and improving economic data, and efforts to raise
the quality of basic information available to scholars and to the intelligent
public at large.

This concern with bringing measurable facts to bear upon basic eco¬ 
nomic problems and with the need for critical scrutiny and evaluation 
of data made available by public agencies persisted throughout Dr. 
Mitchell’s life. In 1916, barely two years after publication of Business
Cycles, the BLS monograph on index numbers of wholesale prices appeared.
This less well-known study, which was reprinted in 1921--an
unusual distinction for a government report--is also a typical example
of Dr. Mitchell’s scholarship and approach--in the care with which the
efforts of earlier scholars in the field are reviewed and utilized, in the
breadth with which the problem is conceived, in the scrupulous attention
paid to the characteristics of the available primary information,
in the happy blend of insight and common sense with which the answers 
are provided and indeed the very questions formulated. 

During the country’s active participation in World War I, Dr.
Mitchell served as Chief of the Price Section of the War Industries
Board. Since he was always quite reticent about this period, one may
surmise that it was not a happy one--for reasons which many scholars
who passed through a similar experience in World War II can well
understand. The pressure of urgent problems, the need for decisions
made upon all too slim a factual basis, the tug and pull of various group
and personal interests, hardly provided an atmosphere satisfactory to a
scholar bent upon operating with wide and thoroughly weighed evidence.
Yet several important and valuable results can reasonably be
attributed to this experience. One was a clearer appreciation of the
difficulties in the assembling of data and of research under government
auspices--with some prescient suggestions for change, foreshadowing
future reforms, made in Dr. Mitchell’s Presidential Address to this
Association in 1918 (see JASA, March 1919). Another was the series
of monographs on the history of prices during the war, of which two
volumes appeared under Dr. Mitchell’s name. But perhaps the most
important result of his war experience was the conviction that neither
the university nor the government sufficed as loci of objective study of
economic problems; and that a research institution, combining the
continuity, theoretical interests, and the broad approach of the academic
scholar with attention to quantitative data and the more realistic
approach of government research, would plug a crucial gap and fill
a badly needed want. It was this conviction that provided the initial
impetus to Dr. Mitchell and some of his wartime colleagues in the organization
of the National Bureau of Economic Research in 1920.

Dr. Mitchell’s own research since the early 1920’s is closely associ¬ 
ated with the National Bureau, of which he served as research director 
until 1946 and as an active member of the staff throughout and until 
the last. He headed the team that made the basic study of national 
income in this country in 1922 and set the pattern for work in the field 
that has grown apace ever since. It was his work on business cycles, in 
the increasingly broad conception of it as a pattern of change in the 
whole economy, that provided the central theme for all the work of the 
National Bureau through the almost three decades of its existence. It 
was his inspiration that held the National Bureau to standards that it 
endeavored to maintain; that attracted to it a group of people who 
combined theoretical interests with a zeal for established and testable 
evidence; and that kept the National Bureau from the temptation to
take hard and fast positions on current and apparently pressing issues
that were not warranted by the existing evidence. It was under the
auspices of the National Bureau that Dr. Mitchell published his introduction
to a new study of business cycles, Business Cycles: The Problem
and Its Setting (1927), and the treatise on Measuring Business Cycles
(jointly with Arthur F. Burns, in 1946). A report dealing with stable
and variant characteristics of business cycles, now in preparation for
publication by the National Bureau, engaged his attention during the
last three years of his life.

Impressive as is this list of contributions during the last quarter
century, it is incomplete in several respects. Dr. Mitchell was part
author of many of the National Bureau publications, either as a direct
contributor (to Business Cycles and Unemployment, Recent Economic
Changes), or by his assistance rendered in review and criticism, or by
the example and inspiration set by his own work. He served as guide
and counselor to many other research organizations and projects--as
chairman of the Research Committee on Social Trends (1929-33), as a
member of the National Planning Board and of the National Resources
Board (1933-35), as a member of the Social Science Research
Council (since 1927)--to list but a few. The last service he rendered
the government was as chairman of the technical committee set up by 
the National Labor Board to review the controversy over the BLS cost 
of living index (1943-44). And those who knew him were all too aware 
of how much of his time and effort was spent in counsel and guidance, 
kindly and modestly extended to all scholars, young and old, who were 
seeking it in increasing numbers. 





The enriching influence of this lifetime of scholarship on research in 
the social sciences, on the teaching of economics and related subjects, 
and on public policy is well recognized and hardly requires demonstra¬ 
tion. The ever widening use of statistical data and tools in the analysis 
of economic problems, the emphasis on the concrete institutional and 
historical framework within which societies live and function, the more 
scrupulous distinction between a recorded observation and plausible 
assumption, are all comparatively recent trends in the economic and 
social disciplines in this country and elsewhere. To this quickening of
the searching spirit in the study of society, Dr. Mitchell's own investigations,
and those directed by him, were a major impetus. It should
also be noted, for the benefit of those who are concerned with direct 
and practical utility, that the return to society from such effort is far 
greater than may appear on the surface. Its value to society is not 
only the obvious one of making possible more intelligent solutions of 
social and economic problems because more is known about the functioning
of the economy--a clear illustration of this was provided in the
work of our economic agencies in World War II. Its even greater, if 
less obvious, value lies in the spread of the spirit of inquiry and of the 
respect for facts, which impose desirable limits on the "mutable Minds, 
Opinions, Appetites and Passions of particular Men.” 

Yet the study of society through the use of statistical and other 
testable evidence is far from an easy task. Those who have wrestled 
with the complexity and variability of observable economic life and 
with the imperfect and treacherous data available on social phenomena, 
know the courage, patience, and sheer moral stamina required in this
struggle and the unusual capacities for organization, analysis, and syn¬ 
thesis needed to bring order out of chaos. It would be useless, and per¬ 
haps impertinent, to inquire by what turn of the wheel that determines 
heredity and environment was Dr. Mitchell endowed so richly with 
all these qualities. But it is important to indicate, as well as one can,
the leading ideas and the broad attitudes that assisted him throughout 
his lifetime. 

These ideas or attitudes may be briefly stated under three heads. 
First was the conviction that the human mind is infinitely productive 
of hypotheses or models and that, from its rich endowment, it proceeds
to originate them in profusion--regardless of the extent to which they
are anchored in testable evidence. This conception led Dr. Mitchell to
approach the products of the human mind with both respect and caution--respect
for the rich insight and small modicum of experience
that they may embody, and caution in accepting the wide interpretation
and inference that are almost inevitably attached to the products.
There is a revealing discussion of this attitude in Dr. Mitchell’s letter
to J. M. Clark (see Methods in Social Science, edited by Stuart A. Rice,
Chicago, 1931, pp. 674-80). It was an attitude particularly helpful in
the field of social study in which group interests and passions tend unconsciously
to color the hypotheses or models originated with such ease
and with such a claim of finality.

Second, and equally important, was the dominant notion of interrelation
in space and continuity in time as the basic characteristics of
social and economic life. It is significant that the whole line of evolution
in Dr. Mitchell’s work is from prices (under the specific angle of inflation),
to business cycles, and to the study of economic change at
large. But this semblance of evolution is deceptive, since in his early
investigations Dr. Mitchell already fully recognized that the study of
any one part is in effect the study of the whole from a particular angle.
This basic idea, in combination with the critical attitude toward
man’s theorizing, resulted in a natural emphasis on testable evidence
and on an approach that, however meticulous with regard to the parts,
never lost sight of the whole. The first attitude made Dr. Mitchell an
empiricist; the second made him a synthesizer in the best sense of the
word. The first helped him to resist the temptation to escape into the
quiet haven of imaginative models and ‘caeteris paribus’es; the second
helped him to avoid refuge in the details of empirical work and kept
him from indulging in the perfectionist’s delight of whittling at
minutiae until the Greek Kalends.

But there was a third and perhaps most basic idea--that there was
some order in the ceaseless change and variance of economic phenomena;
and that the patient building up of testable quantitative data,
accompanied by the cautious and critical use of theories as hypotheses,
might reveal the invariant elements. It is this idea that illuminated
Dr. Mitchell’s work with a steady glow, that served as a powerful
magnet around which the detailed findings in his treatises arranged
themselves in a comprehensible pattern. And it is the quest for this
underlying order that provided the powerful drive in this long life of
search and research--in the belief that as the pattern is gradually revealed
and its concrete manifestations recognized, it will be accepted
by human intelligence as the basis for action on social and economic
problems.

Wesley Mitchell would have been the first to protest against such
analysis in what he would consider grandiloquent terms: he was a
modest and humble man--with a humility that, like all genuine humility,
verged on pride. And I am writing these lines reluctantly. My only
justification is that it is important to realize what guiding ideas were
helpful in a lifetime of fruitful and fundamental scholarship. It is important
to recognize how strong today is the temptation to withdraw
into the security of imaginary models, only distantly relevant to historical
reality--regardless of how mathematically elaborate such
models may be. It is important to see how ever present is the opposite
temptation--to elaborate and check details without concern as to their
place in the broader framework. And while these two lines of intellectual
pursuit are moderately useful in the development of science, the
third direction, equally tempting--to give up hope of finding any intellectual
order and to resolve the problem by withdrawal into esthetic
fancy or intellectual cynicism--can be of negative value alone. Dr.
Mitchell’s life is an inspiring demonstration of how effectively such
temptations can be combatted, and how the spirit of objective inquiry
can yield rich results in the study of human society.

To those who knew him well and to those who knew him slightly 
the passing of Wesley Mitchell is a great and numbing loss. These 
notes can give no idea of his personality, the quiet and often playful 
wit of his conversation, the genuineness of his moral seriousness, the 
consistent drive of his interests, the warmhearted attitude to and the 
broad tolerance of fellow men. His contributions to our knowledge of
social phenomena, to the data on the basis of which more intelligent 
policy decisions are possible, to the training of a large group of scholars 
in the field, will stand; and, one may hope, will provide a foundation 
upon which work will be carried forward. But the personal loss is irretrievable,
and we are all the poorer for it.


Simon Kuznets



BOOK REVIEWS 

Edited by 

Oscar Krisen Buros
Rutgers University 

Fraction-Defective Charts for Quality Control. British Standards Institution,
British Standard 1313: 1947. London S.W. 1: British Standards Institution 
(28 Victoria St.), 1947. Pp. 40. 6s. Paper. (New York 17: American Standards 
Association [70 East 45th St.] $2.25.) 

Review by Albert H. Bowker 
Assistant Professor of Statistics, Stanford University
Stanford, California 

As the title implies, this standard describes control charts for fraction
defective and is a revision and extension of a small part of British
Standard 600R:1942 Quality Control Charts reviewed by Harold A. Freeman
in this Journal (Sept. 1945, p. 386). The previous standard discusses, in
addition to fraction defective, charts for mean, standard deviation, and several
other statistics computed from continuous measurements. Further, it
utilizes more technical knowledge of statistics than the present pamphlet,
which emphasizes the applied side.

The limitation of the subject matter to the fraction defective chart, the 
omission of a discussion of the statistical principles behind the chart, and 
the internal organization of the pamphlet are designed to facilitate initial 
application of the method. The first section, entitled “The First Control
Chart,” suggests the application of a simple chart in a rigidly prescribed way.
It recommends the selection of a product containing about 7% or 8% de¬ 
fective, using samples of twenty, sampling about 5% of the product, basing 
the process average on 25 samples, and using an upper limit exceeded with 
probability .005. Later sections discuss possible variation in amount of 
sampling, size of sampling, sampling interval, and probability limit. Other 
types of charts are discussed, including control charts based on a given 
standard rather than on an empirically-determined process average; two-way 
control charts, in which a separate control chart is kept for the per cent of 
items which exceed the upper limit and those which fall below the lower 
limit of an allowable range; and compressed limit charts for use when the 
fraction defective is very small. In this latter case, the control chart is based 
on the number of defectives outside gauge limits more stringent than the 
specification limits. 
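
To make the recommended first chart concrete, the sketch below computes an upper probability limit for the number of defectives in a sample. It is an illustration under stated assumptions and not the pamphlet's own procedure: the count of defectives is treated as binomial, the limit is taken as the smallest count exceeded with probability at most .005, and the process average of 7.5% is a hypothetical value chosen from the 7% to 8% range mentioned above. The function names are likewise illustrative.

```python
from math import comb


def upper_tail(n, p, c):
    """P(X > c) for X distributed Binomial(n, p)."""
    return sum(
        comb(n, k) * p ** k * (1.0 - p) ** (n - k) for k in range(c + 1, n + 1)
    )


def upper_probability_limit(n, p, tail=0.005):
    """Smallest count c whose exceedance probability P(X > c) is at most tail."""
    for c in range(n + 1):
        if upper_tail(n, p, c) <= tail:
            return c
    return n  # not reached in practice: P(X > n) is zero


if __name__ == "__main__":
    sample_size = 20          # samples of twenty, as recommended
    process_average = 0.075   # hypothetical value in the 7% to 8% range
    limit = upper_probability_limit(sample_size, process_average)
    print(limit, upper_tail(sample_size, process_average, limit))
```

Under these assumptions a point would be flagged when the number of defectives in a sample of twenty exceeds the returned limit.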

The discussion of the first control chart assumes that the product to which 
the chart is applied is already in control and discusses appropriate action 
when an occasional point falls on or beyond the control limit. A considerable 
number of cases have been reported in American literature in which the 
initial application of control charts leads to the discovery that the process
is badly out of control, with, in some cases, a majority of the points falling
outside the control limits. Statistical control is apparently achieved only 
after a lengthy study and modification of the process. A description of this 
phenomenon might keep initial users from becoming discouraged if their in¬ 
dividual sample qualities fail to cluster neatly around the process average, 
as in the examples provided here. 

The discussion of control limits based on a given standard assumes the 
specification of a maximum permissible average level for defectives. Control 
limits are found by treating the specified percentage defective as the process 
average. It is clear that, if articles are produced with quality equal to this 
specified average, only an occasional point will be outside the control limits. 
Indeed, the presented quality would have to be a great deal worse than this 
maximum permissible average level before we could state with high prob¬ 
ability that an out-of-control point would be obtained. Thus, the terminol¬ 
ogy “maximum permissible” is somewhat confusing. 

In all the examples in the pamphlet, only upper limits on fraction de¬ 
fective are included. Customary American practice is to use both upper and 
lower limits. However, the reader is advised to investigate the cause of a 
consistent run below the process average. Another departure from common
American practice is the use of probability limits, as opposed to 2σ or 3σ
limits. Further, the expected number of defects per sample is less than some
American authorities recommend or imply. 

The pamphlet is well done and has several desirable features. The direc¬ 
tions for setting up charts are very clear; the use of symbols has been almost 
entirely avoided; the instructions are reproduced conveniently in sample instruction
sheets at the end of the pamphlet; there are several examples with
practical advice; and there are references for further study. 

The reviewer agrees with the review cited in the first paragraph, which 
concludes: “It is possible that this particular job has now been done well 
enough. The reviewer would welcome a pamphlet on the statistical theory 
on which quality control rests for this is not quite as obvious and ironclad 
as these excellent applied publications make it out to be.” 


Principles of Counting and Probability. J. C. Abbott (Associate Professor of
Mathematics), and T. J. Benac (Associate Professor of Mathematics). (United
States Naval Academy, Annapolis, Md.) Annapolis, Md.: U.S. Naval Institute,
1947. Pp. iii, 40. Paper. $1.00.

Review by Herbert Solomon 
Air Intelligence Specialist, Air Intelligence Division
Headquarters, United States Air Force,
Washington 25, D.C.

This booklet is intended primarily for students in naval science, particularly
naval gunnery. However, the illustrations and exercises are directly
analogous to those encountered in aircraft gunnery and bombing procedures.



134 AMERICAN STATISTICAL ASSOCIATION JOURNAL, MARCH 1949 

The booklet is divided into two chapters: I, “Principles of Counting” and
II, “Probability.” Answers are given at the end of the text to the 282 problems
posed at intervals in the booklet. An attempt at every fourth exercise
revealed no irregularities in the published answers.

Chapter I has some faint rumblings of set theory but mainly presents, in 
a very brief manner, material on permutations and combinations which can 
be found in any of the standard textbooks. Chapter II presents some of the 
fundamental theorems of probability in a simple non-rigorous manner which 
is, no doubt, exactly the intention of the authors. This lack of rigor will of 
course present no difficulties in working out the exercises or understanding 
the illustrations. The only probability distribution discussed is the binomial
distribution, and very few of its characteristics are studied. The multinomial
distribution is left to the exercises. A noticeable omission is the normal
distribution, which plays a very important part in fire control studies in
military and naval science.

As mentioned before, this booklet is designed for a rather special group,
but except for its illustrations and exercises the material contained can also
be found in many easily accessible books on algebra and mathematical
probability.


Business Cycles and Forecasting, Third Edition. Elmer Clark Bratt (Professor
of Economics, Lehigh University, Bethlehem, Pa.). Chicago 4, Ill.: Richard D.
Irwin, Inc. (332 South Michigan Ave.), 1948. Pp. xii, 585. $6.00.

Review by William A. Spurr
Professor of Business Statistics, Stanford University
Stanford, California

This book is the first satisfactory general survey work published in its
field since the war. Professor Bratt has condensed and considerably revised
his last edition of 1940, which had suffered the rapid obsolescence
typical of this period.

Like its predecessors, this edition begins with a brief treatment of seasonality
and long term trends, and then covers the whole gamut of business
cycles: their measurement, causes, theories, history, barometers, projection
methods and proposals for stabilization. A postwar book of such scope has
very real value for both the student and the business man.

The topic which has required the most complete revision since 1940 is that
of business indicators or “barometers” (Chaps. 15-17). The concept of
gross national product is now given central importance (at the expense of
other general business indicators), and a method of projecting its components
is offered later as the core of a five-step “effective program for business-cycle
forecasting” (pp. 437-443). An actual case illustration, however, is
needed to clarify this very promising method.

Another “one of the distinctive features of this third edition” which “adds
leverage to the analysis” (p. v) is the treatment of secondary trends separately
from secular trends. The concept of the secondary trend (which is related
to the long cycles or intermediate trends described so variously by
Wardwell, Kondratieff, Juglar, Kitchin, Silberling and others), however,
remains a shadowy one, as it does in Burns and Mitchell's Measuring Business
Cycles (New York: National Bureau of Economic Research, 1946,
Chap. 11). The results are therefore inconclusive. Bratt properly points
out that “regularity of recurrence of the secondary trend [is] a completely
undemonstrated conclusion” (p. 71), thereby avoiding a basic fallacy of
Dewey and Dakin (Cycles. New York: Henry Holt and Co., 1947), so that
“no consideration can be given to the forecast of the secondary trend” (p.
77). Later, though, he seems unduly pessimistic in concluding that therefore
secular trends may not be projected (p. 77).

The treatment of business forecasting (Chap. 18) is considerably condensed
from that of the revised edition (Chaps. 21-22). One misses a detailed discussion
of leads and lags, specific historical analogy and several other traditional
forecasting procedures that may still have some validity today.
Furthermore, the pessimistic statement (p. 420) that “we have practically
no basis whatsoever for forecasting originating causes” (such as acts of
government and wars) seems to be countered by the partial success of such
Washington forecasters as Cherne and Kiplinger and by Bratt's own section
on “Measurable Effects of Originating Causes” (pp. 401-402).

Other parts of the book are revised less radically. Seasonality, secular 
trends (including growth curves) and several methods of analysing time 
series are presented in readable, nontechnical fashion. A preliminary chapter
on “Concepts of Balance” in the revised edition has properly been omitted.

The detailed treatment of factors responsible for the cyclical nature of
business (Chaps. 5-6) remains “the central part of the analysis” (rev. ed.,
p. vi), and provides a valuable analytical description of the course of a
typical cycle. Other sections of particular interest are those on the distinction
between difference and summation series (pp. 90-92) and on measures
of business confidence (pp. 416-418).

The eclectic survey of business cycle theories (Chaps. 7-9) has been condensed
and simplified since the 1940 edition, but is basically unchanged.
The treatment here is perhaps not quite as lucid as in Estey’s Business
Cycles (New York: Prentice-Hall, Inc., 1941) but is more comprehensive.

The history of business cycles has been brought up to date through the 
end of 1946 (Chap. 14). The concluding section on parallels between World 
Wars I and II (pp. 350-352) is a provocative one which would justify more 
penetrating and detailed treatment. 

The final section on proposals for stabilizing business cycles has been expanded
in line with the growing importance of this problem. The recent
work of the President's Council of Economic Advisers is included.

The chief virtue of the book as a whole is its broad, impartial survey of 
all the main aspects of business cycles, reflecting the author's accumulated 
experience in preparing three editions of this work. This reviewer fully subscribes
to Bratt’s basic policy of separating time series into secular, seasonal
and cyclical-random elements, and his use of the analytical approach to
business cycles developed by Wesley C. Mitchell in his 1913 and 1927 classics.
The third edition is thoroughly modernized and is well organized. The
faults are minor. While it is sometimes obscure or superficial, it provides
many references to primary sources as guides for more intensive study. The
condensation in general is an improvement, though the printer’s crowding
of more words per page makes for slightly more difficult reading than before.
This book is recommended both as a text and as a general reference work.

Theory of Experimental Inference. C. West Churchman (Associate Professor of
Philosophy, Wayne University, Detroit 1, Mich.). New York 11: Macmillan
Co. (60 Fifth Ave.), 1948. Pp. xi, 292. $4.25. (London W.C. 2: Macmillan & Co.,
Ltd. [10 St. Martin’s St., Leicester Sq.] 21s.)

Review by John W. Tukey

Assistant Professor of Mathematics, Princeton University, Princeton, N. J.
Member of Technical Staff, Bell Telephone Laboratories, Murray Hill, N. J.

The author has tried to write a challenging book, and has succeeded. By
mixing modern statistical inference and classical philosophy he has
written a book which could serve to introduce statisticians to philosophy
and philosophers to modern statistical inference. The book is a far from
perfect tool for either job, but it will, in this reviewer’s opinion, have substantial
influence both on philosophers and on statisticians. It is, however,
the meaning of the book to statisticians interested in the foundations of
their subject which is the chief topic of this review.

The first three chapters are devoted to a discussion from the point of view
of formal science and philosophy of modern statistical inference, as exemplified
by the Neyman-Pearson theory, and a brief discussion of its relations
with scientific method. The next chapter outlines a formal classification of
systems of philosophy, according to their views on knowledge, into rationalism,
naive empiricism, statistical empiricism, criticism, relativism and,
finally, experimentalism. The first five schools are each the subject of a
chapter, while experimentalism, founded by E. A. Singer and supported by
the author, is discussed in four chapters. The book concludes with three
chapters relating inference with social groups, social purposes, and a proposed
science of ethics.

The most striking point to the statistician who is concerned with the
foundation of his subject and who believes that the millennium is still far
away is the complete acceptance by the author of a definite methodology of
statistical inference as the methodology. The reviewer feels that the present
methodology of statistical inference has been significantly biassed by desires
for (i) analytic manageability, (ii) mathematical simplicity, and (iii) unwarranted
uniqueness. The philosopher and the statistician now need to collaborate
in working toward the ideal basis of statistical methodology.
Another striking point is the insistence of the author on security. He admits
that complete freedom from risk is impossible in practice, but he insists
that the possibility of reducing the risk to an arbitrarily small value is
essential. Again: “And he must be sure about these things, or else he would
find it impossible to act efficiently; he cannot even entertain the notion that
there are risks involved in his decisions, for if such doubts creep in, he finds
it impossible to act quickly and efficiently” (p. 236). This sentence is supposed
to refer to everyday decisions, but it expresses the author’s general
philosophy.

Finally: “... we must have criteria of the most efficient methods of solving
problems before we can give responses to any questions” (p. 243). This 
statement is the more surprising when we recall that a “response” to the 
author is only an approximate step toward an answer. 

The author’s discussion of presuppositions and their importance in statistical
inference (p. 12) should be read by all statisticians interested in
methodology.

The author holds that: “In a sense, the problem of the best ‘design’ of an
experiment is exactly the problem of the philosophy of science ...” (p. 21).
This is closely related to his quotation from R. A. Fisher: “The more thorough
the design of the experiment, the more meaningful is the question
asked” (p. 208). The author and the reviewer are in agreement as to the
validity and importance of these quotations, but we draw differing conclusions.
The author concludes that design of experiment transcends statistics,
while the reviewer concludes that the philosophy of science is a part of
statistics, since he defines statistics as: “The science, the art, the philosophy,
and the technique of drawing conclusions from the particular to the general.”
Leaving this difference aside, it follows that any adequate account of the
design of experiment must include serious attention to the philosophy of
science. The author is led to the following strong statement: “But there
should be realization in statisticians’ minds that they have pushed their
basic problem beyond the field of formal statistics when they attempt to set
down the criteria of best test. The danger of not realizing this point lies in
the possible action that will result when a formally defined criterion of best
is taken to satisfy nonformal demands of the science of value” (p. 283).

The author’s solution of the philosophical problem posed by randomness
is: “We would not be able to find randomness in our observations had we not
first put it there in some form” (p. 124). This is consistent with his desire for
security and his faith in a future physics without indeterminacy (pp. 77 and
231-233).

Since the author’s philosophy does not provide explicitly for the criticism
of statistical presuppositions, it is not particularly surprising that, on page 
284, he holds that tolerance limits can be set with equal validity from small 
as from large samples. For this would be correct if the usual presuppositions 
were correct. 

At a more philosophical level, the author concludes that science, in any
sense, can only exist when nature is regular: “That is, the meaning of an
observation presupposes a principle of regularity in nature” (p. 128). This
position must, it seems to the reviewer, be accepted. But the author goes on
to insist that: “The reason that the relative frequencies must approach
some limiting value is that the question of probability is otherwise meaningless;
one is ‘guaranteed’ that they do by the natural image which is presupposed
in all experimental problems” (p. 203), and to insist that: “The
fundamental postulate of experimentalism, therefore, is the following: There
exists a formalization of nature, such that stochastic limits exist for certain sequences
of mathematical functions of the observations which are pertinent to a
given question of fact” (p. 178). This seems to the reviewer an overstrong and
unwise requirement of security. The pertinence of the observations is, according
to the author, to be settled by “formal” methods: “... the justification
for assuming that a certain set of actions produces pertinent observations
depends upon theoretical (formal) considerations on the part of the
experimenter. These considerations must be presupposed by him in conducting
his experiments. The more aware he is of the nature of these presuppositions,
the more exact is his experimental method” (p. 271). In the same
vein, the author holds that “every statistical hypothesis should be a consequence
of a formal theory of nature” (p. 218). The direction of this proposition is undoubtedly
good, but it goes much farther than the reviewer would care to go.

In his approach to control, the author emphasizes the stochastic limit
again: “... an experiment is said to be controlled if we state all the formal
conditions under which a mathematical function of a series of observations approaches
a limit stochastically” (p. 182). This strong definition is then used
in: “No question of fact can be said to have meaning unless there exists a controlled
experiment for its answering” (p. 183). The reviewer feels that this is
a roundabout way to say that no question of fact has a meaning.

In discussing the adequacy of formal probability theories, the author
seems to confuse “determination in theory” and “determination in practice.”
He demands: “Let $O(x_1, x_2, \ldots, x_n)$ be any random sample, with known
elementary probability law; let t be any statistic of the sample with degrees
of freedom at least 1; then the theory should be able to state the elementary
probability law of t” (p. 19), and then he asserts (pp. 19 and 30) that this
demand has not been met in the present probability theory. While the mathematical
statistician may not be able to provide a compact and usable answer
to many problems of distribution, he can provide a systematic and finite
process for determining the distribution within any preassigned limits. In
the author’s sense of “answer” it seems to the reviewer that modern probability
theory provides “answers” to all problems of distribution involving a
finite number of observations.

In the author’s discussion of the philosophy of science, this reviewer was
struck by the statements that (i) “We may find it methodologically profitable
to keep contradictory tenets within science” (p. 192); (ii) “There is no
true beginning-point to science” (pp. 209-210); (iii) “The time has come to
recognize the circularity, or spiral form, of science, and the complete interdependence
of the sciences” (p. 216); (iv) “Hence, science demands a science
of efficiency, and cannot establish such a theory within psychology or the
science of social groups. The science of ethics, for such we call the measure
of loss, must on the one hand belong to experimental science, and yet not
be an aspect of any of the special disciplines now recognized” (p. 250). With
the first three of these the reviewer is in hearty accord; on the fourth he feels
uninformed.

The real difference between the author and the reviewer is in their approach
to models, whether mathematical or formal. The author is prepared
to take a model on its face value, apparently without consideration of its
weak points. It does not seem to the reviewer that this is how science has
made its great gains by the use of models. It is by combining a working
model with a more detailed, and probably unmanageable, model which indicates
the soft spots of the working model that science has progressed.

Passing now from matters of opinion to matters of fact, there are a few
specific points. On page 7, the author states that, when the null hypothesis
holds and sample size increases to infinity, “t will have a limiting value of
zero.” This is incorrect. On page 35, the author states that the problem of
confidence intervals for means of later samples is “the so-called problem of
Tolerance Intervals.” This is a slip. On page 257, the author suggests that
the best test is obtained by minimizing the integral of the risk over the parameter
space. This is not invariant, and of doubtful utility. A similar difficulty
occurs at the top of page 211. The wording of the next to the last paragraph
on page 9 and the two-valued use of n on page 16 seem sloppy. On page 12,
the author asserts that “we could—find a best test” for slippage with an arbitrary
continuous distribution. The reviewer would be interested in the
definition of “best” and the resulting test.

The book is singularly and pleasantly free from typographical errors (the
only ones noted were “procedure” on page 207 and “m1/m2” for “m2/m1”
on page 211) and is excellently printed and bound.

The reviewer would not have taken so much space to review a book he 
judged of little use. Although he disagrees with the author on almost all the 
really basic points, he plans to use the book in connection with a course in 
the design of experiment this fall. 


Quality Control: A Manual of Quality Control Procedure Based Upon Scientific
Principles and Simplified for Practical Application in Various Types of Manufacturing
Plants. Norbert L. Enrick (Associate Professor of Management, Southwestern
Louisiana Institute, Lafayette, La.). New York 13: Industrial Press
(148 Lafayette St.), 1948. Pp. vi, 122. $3.00. (Brighton 1, England: Machinery
Publishing Co., Ltd. [148 Lafayette St.].) Two reviews follow:

Review by J. H. Curtiss 
Chief, National Applied Mathematics Laboratories 
National Bureau of Standards, Washington, D. C. 


According to the Introduction, this book is intended for practical men
in inspection who do not want to be bothered with “higher mathematics,”
but who would like to have statistical quality control explained in
simple terms. “Higher mathematics” here means anything beyond grade
school arithmetic. The spirit of the book is perhaps best conveyed by reporting
the fact that the presentation of control charts for averages is so
arranged that in setting up the control limits, the mean range of a set of
samples never has to be multiplied by any factor more complicated than
unity!

Thus the author has imposed rather severe conditions of limited visibility
on the flight of his muse. The result is a sort of minimum cook-book of
statistical quality control recipes, supplemented by some practical advice
on the management aspects, and by some rather sketchy and disjointed remarks
on tolerances and gages. An elementary discussion of the underlying
theory is also given in a few pages at the end of the book.

The statistical quality control recipes occupy the first 45 pages or so of
the book, with a little additional statistical material (mainly on “compressed
limit gaging” and statistical study of tolerances) scattered through later
chapters. The two main statistical techniques discussed are lot-by-lot inspection,
using sampling by attributes, and control charts for averages and
ranges. There is no treatment of charts for numbers of defects and for proportion
defective.

At the outset of the chapter on lot-by-lot inspection, the author promises
to demonstrate later that one should not use lot-by-lot inspection on inspection
lots containing less than 300 items, but the reviewer was unable
to find the demonstration of this theorem in the ensuing text. Sampling
tables are given in the form of double entry tabulation, one argument being
lot size range, and the other “allowable per cent defective.” There are two
tables for discrete items, one of them containing sequential sampling plans,
and the other single sampling plans with operating characteristics similar to
the sequential ones. These tables are supplemented by two roughly parallel
ones for use on continuous products. Although no credit is specifically given
in the text, the sequential sampling tables are copied from the “Inspection
Handbook on Sampling for Quality Control,” QMC-M605-15, published by
the Office of the Quartermaster General in 1945. Presumably the other tables
are taken from material developed for later editions of the QMG Handbook.

The concept of “allowable per cent defective” used in this book seems to
be a sort of mixture of Average Outgoing Quality Limit (AOQL) and Acceptable
Quality Level (AQL) as these terms are now used in the technical
literature. Mathematically, the “allowable per cent defective” ascribed to
each plan is approximately equal to its AOQL, a fact implied by the brief
elementary discussion of the theory of the plans given at the end of the book.
The instructions and examples, however, seem to handle the “allowable per
cent defective” as if it were an AQL.

The control chart for averages is set up as a test of the compound hypothesis
that the population mean $\mu$ lies in the range
$T_1 + 3.1\sigma \le \mu \le T_2 - 3.1\sigma$,
where $T_1$ and $T_2$ are preassigned lower and upper tolerances for individuals
and $\sigma$ is the population standard deviation. The test is carried out with the
arithmetic mean of a sample of size 3 to 5, using a level of significance corresponding
to a $2\sigma$ tail of the distribution of the mean. That is: the control
limits are given by $T_1 + (3.1 - 2/\sqrt{n})\sigma$ and $T_2 - (3.1 - 2/\sqrt{n})\sigma$. If $3 \le n \le 5$,
the theoretical mean value of the sample range (assuming normality) is
roughly equal to $(3.1 - 2/\sqrt{n})\sigma$, so the very simple practical rule for finding
the control limits mentioned in the first paragraph of this review is obtained.
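
The approximation behind this rule can be checked directly. The sketch below is illustrative only: the function names are hypothetical, and the values of d2 (the expected range of a normal sample divided by the standard deviation) are the standard tabulated constants, not figures taken from the book under review.

```python
from math import sqrt

# Standard tabulated d2 = E(range) / sigma for normal samples of size n.
D2 = {3: 1.693, 4: 2.059, 5: 2.326}


def limit_factor(n):
    """Factor (3.1 - 2/sqrt(n)) multiplying sigma in the control limits."""
    return 3.1 - 2.0 / sqrt(n)


def control_limits(t_lower, t_upper, sigma, n):
    """Control limits for the sample mean, set inward from the tolerances."""
    k = limit_factor(n)
    return t_lower + k * sigma, t_upper - k * sigma


# Compare the factor with the expected range (in units of sigma).
for n in sorted(D2):
    print(f"n={n}: 3.1 - 2/sqrt(n) = {limit_factor(n):.3f}, d2 = {D2[n]:.3f}")
```

Because the two columns agree only roughly, replacing the factor times the standard deviation by the mean range itself is a convenience rather than an exact identity; the agreement is closest near n = 4.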

This type of control chart of course differs from the standard Shewhart
control chart for averages from the viewpoint of both engineering and mathematical
theory. An obvious minor disadvantage (which, however, may be
a grave one for the intended users of this manual) is that a pair of tolerance
limits must be given before the recipe can be carried out. A major disadvantage
is that a fundamental Shewhart control chart doctrine is ignored:
a principal goal in quality control is the achievement of a state of statistical
control about stable population values of $\mu$ and $\sigma$. In the present type of
chart, $\mu$ is theoretically permitted to wander about at will between the
limits given above. But the book is written primarily for the line inspector,
and not for management, nor for quality control engineers; and perhaps a
control chart which places first emphasis on the immediate avoidance of nonconforming
product, as this one does, is the right one to present under the
circumstances. The control chart for the range is given the orthodox treatment.

In his effort to be brief and clear, the author omitted some points which
the reviewer considers rather essential for the proper application and interpretation
of even the few simple techniques here treated. No operational
meaning is given to the words “random sampling,” which are explained
in a circular manner by simply repeating the word “random” in a couple of
different contexts. Rational subgroups are not adequately discussed in
connection with control charts. In the discussion of tolerance ranges, the
correct location of the range, as determined statistically, is ignored, and only
the width of the range is discussed.

As stated before, the technical material on statistical control takes up a 
little less than half of this 120 page book, and in the opinion of the reviewer, 
it could have been presented in an unhurried pamphlet of about 25 pages. 
The density of thought in the chapters on tolerances and gaging, and in 
other later chapters, seemed rather low, and the reviewer wondered how 
much of that material would be new or useful to an experienced inspector. 
The book certainly does not begin to cover adequately the non-statistical
aspects of quality control, and indeed the author would probably disavow 
any intentions in this direction. 

On the other hand, the style is simple and clear; the many examples are
well-chosen and informative; and all-in-all, this is a very readable little
treatise. It must be left to members of the intended audience, rather than
to this reviewer, to judge whether the extra pages were well worth the time
and effort. The judgment may very well be in the affirmative.




Review by E. H. MacNiece
Director of Quality Control, Johnson & Johnson
New Brunswick, New Jersey

This book effectively leads the reader into a simplified quality control
program. The program is clearly outlined and stated in such a manner
that the shopman is familiarized with the use of the method without the
complicated terminology and mathematics so frequently found in books on
this subject. Perhaps its greatest service will be in the conditioning of nontechnical
shop personnel for the acceptance of quality control as a means of
achieving productivity in terms of acceptable quality with low waste rather
than high production with too much of it finding its way to the scrap pile
or the salvage department. Mr. Enrick’s book is highly recommended as
primary reading for men in industry who want to produce acceptable economic
quality.


Traffic Performance at Urban Street Intersections. Bruce D. Greenshields,
Donald Schapiro, and Elroy L. Ericksen, Yale Bureau of Highway Traffic,
Technical Report No. 1. New Haven, Conn.: Bureau of Highway Traffic, Yale
University, 1947. Pp. xv, 152. Gratis.

Review by Harry G. Romig
Member of Technical Staff, Bell Telephone Laboratories, Inc.
463 West St., New York 14, N. Y.

This report presents a practical as well as a statistical analysis of traffic
data covering “the intersections of streets at grade in urban areas.”
Its importance is readily realized since “one-half of all urban traffic accidents
and more than three-fourths of all delays experienced in dense urban areas
are related to intersections.” The traffic engineer will find much new valuable
material in this report, and should have it handy as each page presents important
details that require careful study.

The manner of presentation is excellent. The table of contents and the
complete index at the end are in sufficient detail to make them satisfactory
for ready reference. As much of the descriptive matter centers around the
figures and tables, the authors provide a fine descriptive list of each with
accompanying page references. The book consists of six chapters and six
appendixes. Chapter 1 presents the techniques used in collecting the field
data in permanent form for analysis. Photographic devices used are described
in sufficient detail that others may follow the same procedures in
making other similar surveys. Chapter 2 describes “Starting Performance
at Signalized Intersections.” Practical and theoretical solutions are presented.
Chapter 3 covers “Deceleration of Motor Vehicles at Street Intersections.”
The findings of the study are given in a simple but forceful manner.
Chapter 4 presents the “Behavior Patterns at Unsignalized Intersections.”
The Methods of Analysis are described together with the Detailed Analysis
of Specific Aspects of Behavior. Chapter 5 considers “Highway Traffic and
the Theory of Probability.” The nature of the distributions found is discussed
and it is shown that the Poisson series may be used effectively in the
analysis as the Poisson distribution appears to fit the data. There are two
distinct parts to this chapter, one describing the General Theory and a second
covering the Theory of Random Distribution applied to Signalized Intersections.
Chapter 6 describes “Typical Traffic Problems” and indicates their
solutions. An excellent summary of each chapter is given at its close in all
cases but the First and Fifth. Chapter 1 has no summary, while Chapter
5 has a summary for the general theory and also a summary covering the
case for random distributions. Also six valuable appendixes have been provided
dealing with mathematical relations, tables and important theories that
have been expanded in detail to supplement the main report.

Throughout the study, in taking pictures or making graphs, frame time
intervals of 1/88 of a minute were used. This makes it possible to express
velocity directly in miles per hour if measurements of distance between time
intervals are expressed in feet, i.e., an automobile traveling 5 feet in one such
time interval has a velocity of 5 mi./hr. Pictures were taken at sufficiently
high elevations to provide a view of the intersections studied and timing
devices were included to permit ready identification of the different frames.
Later, it was possible to study each frame individually or a run of frames to
properly analyze the different conditions under study. In addition to the
splendid charts provided in this report, 9 photographs are included showing
the intersections involved, and the projector and mounting.
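
The convenience of the 1/88-minute frame interval follows from the fact that one mile per hour is 88 feet per minute. A minimal sketch of the arithmetic is given below; the function and constant names are illustrative and do not come from the report.

```python
FEET_PER_MILE = 5280
FRAMES_PER_MINUTE = 88  # one frame every 1/88 of a minute
MINUTES_PER_HOUR = 60


def mph_from_feet_per_frame(feet_per_frame):
    """Convert distance travelled between successive frames (feet) to miles per hour."""
    feet_per_hour = feet_per_frame * FRAMES_PER_MINUTE * MINUTES_PER_HOUR
    return feet_per_hour / FEET_PER_MILE


# A vehicle covering 5 feet between frames is moving at 5 mi./hr.
print(mph_from_feet_per_frame(5))
```

Since 88 x 60 = 5280, the conversion factor is exactly one, so feet per frame can be read off as miles per hour without further computation.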

In studying starting performance at signalized intersections, attention was
given to three factors: (1) time required for vehicles to commence motion,
(2) distances reached by vehicles in given time intervals after starting, and
(3) spacing between vehicles. Small trucks are treated the same as passenger
cars, but buses and large trucks are studied separately. Where no signal 
occurs at an intersection or one street has a “Stop” sign, collision points were 
selected in the middle of the intersections. Velocity, delay factors, reactions 
of different drivers, and other factors were considered. Time value to the 
collision point was found to be the main criterion by which drivers decide 
to take precedence over vehicles approaching on the cross street. 

The report added much to its value by including the range of variation 
indicated by the data for the various situations covered. Its last two chapters 
discuss the application of probability theory to the problem considered and 
indicate the use of the Poisson exponential distribution. The authors indicate 
that in 1934 “the theory that traffic follows a random distribution was as¬ 
sumed by Mr. John P. Enzer in an article in which he calculated the prob¬ 
ability of any car picked at random going a mile without interference or 
delay on a two lane road with a given volume of traffic.” The authors develop 
the theory and show that with the exception of one second spacings the 
Poisson theory fits their data fairly well. The exception is due to the desire 
of drivers to avoid rear end collisions. It is possible to apply the theory and
obtain reasonable solutions to many traffic problems, which formerly defied 
solution. 
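
To indicate the kind of calculation the Poisson theory permits, the sketch below assumes that vehicles pass a point at random at an average volume of q vehicles per hour. The number arriving in t seconds then follows a Poisson distribution with mean qt/3600, and the chance of a gap of at least t seconds is exp(-qt/3600). The volume of 600 vehicles per hour and the function names are illustrative assumptions, not figures or methods taken from the report.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k arrivals when the expected number is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def prob_gap_at_least(t_seconds, vehicles_per_hour):
    """Probability that no vehicle arrives during t_seconds (a gap of at least t)."""
    mean = vehicles_per_hour * t_seconds / 3600.0
    return poisson_pmf(0, mean)


if __name__ == "__main__":
    q = 600  # hypothetical volume, vehicles per hour
    for t in (1, 5, 10, 20):
        print(f"gap >= {t:2d} s: {prob_gap_at_least(t, q):.3f}")
```

Under this model short gaps are the most frequent, which is where the review notes that observed one-second spacings depart from the theory.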

Many numerical examples are given and also relations covering the solu¬ 
tion of different varieties of traffic problems. Approximate relations are given 
for use in solving problems when the work of computation becomes too diffi¬ 
cult for obtaining the exact theoretical solution. Chapter 5 covers the 
theoretical treatment and Chapter 6 presents solutions to a number of 
typical traffic problems. To readers other than traffic engineers it would have 
been helpful to have presented in an introduction, or in Chapter 1, the typi¬ 
cal traffic problems that are to be solved. Even after reading the report 
several times, it is not clear how the timing of red and green signals is obtained
for the most efficient movement of traffic. When should there be a
flashing red? a flashing amber? a policeman in control? a Stop signal at only 
one intersection? a Stop signal at two intersections? Chapter 6 is supposed to 
provide answers to some of these questions. Those preparing the report 
were doubtless more interested in the analysis of their results than in delin¬ 
eating how these results can be applied. The last example on signal timing is 
excellent. More applications of this nature should be included. Other reports 
can be made more valuable by spending a little more time at the beginning 
and end in showing how to use the findings. 


Mathematics of Sampling. Walter A. Hendricks (Principal Agricultural Statistician,
Bureau of Agricultural Economics, Washington 25, D. C.). A summary
of a course of lectures given during the 1947 Statistical Summer Session at Virginia
Polytechnic Institute. Virginia Agricultural Experiment Station, Special
Technical Bulletin. Blacksburg 13, Va.: the Station, February 1948. Pp. ii, 45.
Gratis.

Review by T. A. Bancroft 

Research Professor of Statistics and Director of the Statistical Laboratory 
Alabama Polytechnic Institute, Auburn, Alabama

Although the material for the most part is not new, having been taught
in survey sampling courses at various statistical centers, in particular at 
Iowa State College, it presents in published form an introduction to the 
mathematics of survey sampling. It should be a welcomed addition to the
unfortunately small amount of published material available in this rapidly 
expanding field of statistics. The booklet should be of value as a reference for 
workers engaged in survey sampling as well as for teachers and students of 
its theory and practice. 

The mathematical aspects stressed are those that are basic to an under¬ 
standing of the sampling designs and analyses used in actual sampling sur¬ 
veys conducted at the present time and for the most part by various federal 
agencies and certain universities with strong statistical sections. Since the 
booklet is concerned with the mathematics of sampling, no attempt is made 
to discuss techniques of planning, schedule or questionnaire construction, 
organization, field operations, etc. A good idea of the type of material discussed
can be obtained from the following list of headings: Classical Error
Theory, Random Sampling in Practice, Analysis of Variance and the Esti¬ 
mation of Variance Components, Stratified Sampling, Subsampling, Cluster 
Sampling, Binomial and Multinomial Sampling Variation, The Problem of 
Nonresponse, Linear Regression, and The Method of Least Squares. A 
selected but valuable list of references is given in a section on suggested 
reading. In the reviewer’s opinion the value of the booklet would have been 
greatly enhanced by the addition of sections on: choice of sampling unit, 
determination of sample size, confidence intervals, double sampling, and 
variances of totals based on various methods of estimation. 

Although the title of the booklet contains the word “Mathematics,” no 
attempt has been made to give either rigorous detailed mathematical proofs 
or to introduce useful powerful mathematical concepts or machinery to 
shorten such proofs. Instead the heuristic approach has been used, the details 
of proofs in many cases being suggested rather than explicitly stated. General 
theorems, probability distributions, and formulas have been advanced as 
true because of their analogy with simpler cases. The manner of presentation 
is understandable since the booklet is a summary of a few lectures covering a 
broad field. It seems to the reviewer that a valiant effort has been made, by 
the use of these methods, to make the methodology of survey sampling rea¬ 
sonable to workers engaged in this field who may have a modest background 
in the elements of mathematical statistics. If such be the case, it seems to the 
reviewer that on the whole the author has been successful with one important 
exception. It is the opinion of the reviewer that the fundamental assumptions 
and limitations involved in setting up various mathematical models, espe¬ 
cially in the case of the analysis of variance and of such proofs as indicated in 
formula (62) and the simpler case at the bottom of page 11, should be given. 
It is true that an indication of the fundamental assumption of linearity in 
the analysis of variance model is given in equation (50), but in the reviewer’s 
opinion a greater understanding would have resulted from beginning with 
the usual assumptions, i.e., $x_{ij} = \mu + \alpha_i + \cdots$, etc., and with detailed definitions,
even though the derivation in the latter case is longer. 

No mention seems to have been made of the formulas for the variance of 
a product and the variance of a quotient. For the sake of comparison, it 
would seem desirable to give a proof of the usual formula found in the 
literature for the variance of the mean of a sample from a finite population 
in addition to the one given in the booklet. 

There are several typographical errors. Also the reviewer differs with the 
author on several points of notation. On page 6 in (28) and again on page
15, r has been used in place of ρ. On page 12, Table 2, should be added to
K<ro* for the mean square between classes. On page 9, in (41), it would seem 
more appropriate to replace s* by (O* ^ is defined by (38). Page 13,
(49) and (50) should have kb and S** respectively on the left sides. In solving
for various variance components, greater clarity should result in replacing 
(Tq^ and 0 ^ in the equations of estimation, by Si and 6^. 




Principles of Medical Statistics, Fourth Edition. A. Bradford Hill (Professor
of Medical Statistics and Director of the Department, London School of Hygiene 
and Tropical Medicine, University of London, London, England). London 
W.C.2: Lancet Ltd. (7 Adams St., Adelphi), 1948. Pp. ad, 262. 10s. 6d.

Review by Margaret Martin
Assistant Professor of Preventive Medicine and Public Health
Vanderbilt University, Nashville, Tennessee

News that A. Bradford Hill’s excellent book on medical statistics is again
available, in the form of an enlarged fourth edition, is indeed welcome. 
The principal changes from earlier editions are the addition of a new chapter 
on averages, a section on the normal curve, and the expansion of the chapters 
on frequency distributions and graphs, chi square, life tables, and standard¬ 
ized death rates. 

The clarity of the presentation, the emphasis on the meaning and inter¬ 
pretation of statistical results, and the inclusion of numerous examples 
illustrating the dangers of careless statistical thinking account for the popu¬ 
larity which this work has enjoyed since its first appearance as a series of 
articles in The Lancet in 1937. Medical students, physicians, and other workers
in the medical fields who wish to gain an understanding of the principles
of elementary statistics will find it most helpful and stimulating. 

On the whole the selection of material to be included in this elementary 
text of nonmathematical character is excellent. The reviewer feels that it 
would be desirable to have included a table of probabilities for the normal 
curve to be used in significance tests, especially since such a table is given for 
chi-square; that in the discussion of significance tests for proportions, more 
detailed consideration might have been given to the conditions necessary 
for reasonably reliable application of normal curve theory and to the correc¬ 
tion for continuity; and finally, that follow-up studies in which cases are 
under observation for fractional parts as well as for whole numbers of years 
might have received more complete treatment. On the other hand, the calcu¬ 
lation of the average length of after-life in a study in which life experience is 
not complete for all patients (p. 173) does not seem to be particularly useful 
and might, in fact, lead to misinterpretation. 

A few minor errors have been noted. In the calculation of the median of 
grouped frequency distributions (pp. 49-51), the point below which there 
are (N+1)/2 instead of N/2 observations, assuming that the observations
are evenly distributed over the interval in which the median falls, is obtained. 
In the diagram on page 65 the intervals labeled one, two, and three standard 
deviations, respectively, are actually twice this amount. The appearance 
of a “minus sign” in line 6 of page 74, when algebraic signs have been ignored 
in corresponding situations in earlier examples (i.e., in a correction term
which is to be squared), might cause some confusion to the reader. In the definition
of the weighted mean on page 245, some necessary parentheses have
been omitted in the numerical example. In the definition of the chi-square
test on page 246, the word “frequencies” would seem to be more appropriate
than the word “values.” 
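
The point about the median of a grouped distribution can be made concrete with a short sketch. It interpolates linearly within the class interval containing the median, on the assumption stated in the review that observations are evenly spread over that interval, and it shows how the result shifts when the rank (N+1)/2 is used instead of N/2. The frequencies are hypothetical and are not taken from the book.

```python
def grouped_median(lower_bounds, width, freqs, use_n_plus_one=False):
    """Median of a grouped frequency distribution by linear interpolation.

    Observations are assumed evenly spread within each class interval.
    With use_n_plus_one=True the target rank is (N + 1) / 2 instead of N / 2.
    """
    n = sum(freqs)
    target = (n + 1) / 2 if use_n_plus_one else n / 2
    cumulative = 0
    for lower, f in zip(lower_bounds, freqs):
        if f > 0 and cumulative + f >= target:
            # Fraction of the interval needed to reach the target rank.
            return lower + width * (target - cumulative) / f
        cumulative += f
    raise ValueError("empty distribution")


# Hypothetical grouped data: intervals 0-10, 10-20, 20-30, 30-40
lowers = [0, 10, 20, 30]
freqs = [5, 8, 12, 5]
print(grouped_median(lowers, 10, freqs))                       # rank N/2
print(grouped_median(lowers, 10, freqs, use_n_plus_one=True))  # rank (N+1)/2
```

With continuous grouped data the N/2 rank is the one the reviewer regards as correct; the second call reproduces the discrepancy she describes.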


An Experimental Introduction to the Theory of Probability. J. E. Kerrich
(Senior Lecturer in Mathematics, University of the Witwatersrand, Johannes¬ 
burg, South Africa). Copenhagen, Denmark: Einar Munksgaard, 1946. Pp. 98. 
Paper. 


Review by J. F. Kenney 

Associate Professor of Mathematics, University of Wisconsin 
623 West State St., Milwaukee 3, Wisconsin


Undoubtedly many teachers have had experiences similar to those of the
author in presenting lectures on elementary statistics to mixed classes 
of students and colleagues who vary widely in mathematical preparation 
and whose interests lie in diverse branches of science. “It is a most interesting
problem,” the author remarks, “to design lectures suitable for such a class.” 
An opportunity to design suitable material on one topic came to him when 
he found himself interned (for his own safety, as a British subject) by the 
Danish government during the recent war. Thus he had the leisure and pa¬ 
tience to conduct the simple but extensive experiments on random events 
that comprise the subject of this book. The main experiments consisted of 
spinning a coin 10,000 times and drawing 5,000 times two ping-pong balls 
out of four of which two bore a red trade-mark and two a green trade-mark. 
(The drawings were made by a fellow-internee “at a rate of about 400 times 
an hour with—need it be stated—periods of rest between successive hours.”)
Also an experiment equivalent to tossing a biased coin was performed with 
a small wooden disc coated on one face with lead. 

Various results from these experiments are recorded in tabular and graphical
form. Data are analyzed both in the large and in sub-sequences with respect
to various ratios such as $m/n$, where $m$ denotes the number of heads and $n$ the
number of spins, and $m_2/(m_1+m_2)$, where $m_2$ denotes the number of times that
green was second in the $m_1+m_2$ experiments in which red appeared first.
The analysis leads to a body of ideas, namely, a mathematical theory which 
describes the observations. Thus the author arrives at the tools of pure mathe¬ 
matics. Using appropriate symbols he discusses complementary, joint, 
mutually exclusive, compound, and conditional events, the addition and mul¬ 
tiplication principles, and the binomial distribution. The normal distribution 
is mentioned briefly and an introduction is given to the notions of estima¬ 
tion and confidence intervals. 
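
The conditional ratio from the ping-pong ball experiment has a simple theoretical counterpart. Assuming every ordered draw of two balls from the four is equally likely, the sketch below enumerates the draws and computes the probability that green comes second given that red came first, the value near which the observed ratio would be expected to settle. The ball labels are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

# Four balls: two with a red trade-mark, two with a green one.
balls = ["R1", "R2", "G1", "G2"]

# Every ordered draw of two balls without replacement, assumed equally likely.
draws = list(permutations(balls, 2))

red_first = [d for d in draws if d[0].startswith("R")]
green_second_given_red_first = [d for d in red_first if d[1].startswith("G")]

p_red_first = Fraction(len(red_first), len(draws))
p_green_second_given_red_first = Fraction(
    len(green_second_given_red_first), len(red_first)
)

print(p_red_first)                     # 1/2
print(p_green_second_given_red_first)  # 2/3
```

Enumeration is used rather than simulation so that the result is exact and does not depend on a random seed.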

In the reviewer’s opinion the author has admirably achieved his objective 
as stated in the Foreword: “In this book, a little ground is covered thoroughly
and great pains are taken to try to present a clear picture of the physical
significance of a mathematical probability. With this background the
student will be better equipped to study the many texts which deal with
‘pure’ theory based on a system of axioms.” And his hope is well founded
when he says: “It is hoped that students of these pages will never have to
reject any of the ideas given here, no matter how much they may refine them 
as their knowledge of the subject grows.” 


Statistical Methods in Medical Research: I, Qualitative Statistics (Enumera¬ 
tion Data). Donald Mainland (Professor of Anatomy, Dalhousie University, Hali¬ 
fax, Nova Scotia). Reprinted from the Canadian Journal of Research, Section E, 
Medical Sciences 26 (1): 1-181 February 1948. Ottawa, Canada: National Re¬ 
search Council of Canada, 1948. Pp. 181. Paper. Apply. Two reviews follow: 

Review by John W. Fertig

Professor of Biostatistics, School of Public Health of the Faculty of Medicine 
Columbia University, New York 32, N. Y.

This article is essentially an expansion of Chapters 2 and 3 of the author's
The Treatment of Clinical and Laboratory Data (Edinburgh, Scotland: 
Oliver and Boyd Ltd., 1938). A detailed consideration is given to small sam¬ 
ples of enumeration data in which the normal curve or chi square solutions are 
not completely satisfactory. Fifty-four pages of tables are presented giving 
confidence limits for a two-fold classification of enumeration data and prob¬ 
abilities or significant differences for four-fold contingency tables. There is 
also a table of chi square and one of four place logarithms of factorials. The 
text covers 103 pages and is divided into an introductory section of 11 pages, 
one of examples covering 54 pages, and one of explanatory semi-theoretical 
notes covering 36 pages. Most of the examples are concerned with the comparison
between a sample and a population relative frequency and with the
four-fold table, including numerous variations of these problems. The problem
of non-dichotomous scales is only briefly considered, as is the problem of
combining information from two or more samples. 

The suggested treatment of the numerous examples consists largely in 
aiding the reader to utilize the tables contained in the article. Practically no
treatment of the rationale of the method is given at the time of the discussion 
of the example. This is reserved for the section on notes. Each example is, 
however, followed by a series of helpful comments. While this reviewer recog¬ 
nizes that the investigators for whom this presentation is intended are often 
not very patient with a discussion of the reason for a certain statistical 
method, he still feels that the incorporation of the section on notes together 
with the examples would have produced a much better appreciation of the 
techniques. 

It is sometimes difficult to appreciate the reason for the author's preference
for chi square, for example, on page 39: “Some investigators still use
the standard deviation or standard error, $\sqrt{Npq}$, instead of chi square, for
comparison of the sample. This is not to be recommended.” It is not pointed
out clearly that the correction for continuity can be used for the normal 
curve as well as for chi square. The author recommends the summation of
chi square values for combining information from several samples, but this 
method may at times be unduly conservative. 
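
The remark that the correction for continuity serves the normal curve as well as chi square can be illustrated numerically. For the comparison of a sample count with a population proportion, the square of the normal deviate based on $\sqrt{Npq}$ equals the one-degree-of-freedom chi square, with or without the correction. The sketch below checks this for one hypothetical sample; the function names and the figures are assumptions made for illustration.

```python
import math


def z_statistic(x, n, p, continuity=False):
    """Normal-curve statistic for x successes in n trials against proportion p."""
    q = 1.0 - p
    diff = abs(x - n * p)
    if continuity:
        diff = max(diff - 0.5, 0.0)  # correction for continuity
    return diff / math.sqrt(n * p * q)


def chi_square(x, n, p, continuity=False):
    """One-degree-of-freedom chi square for the same two-fold comparison."""
    q = 1.0 - p
    expected = (n * p, n * q)
    observed = (x, n - x)
    total = 0.0
    for o, e in zip(observed, expected):
        diff = abs(o - e)
        if continuity:
            diff = max(diff - 0.5, 0.0)  # same correction applied to each cell
        total += diff ** 2 / e
    return total


x, n, p = 14, 20, 0.5  # hypothetical sample
for cc in (False, True):
    z = z_statistic(x, n, p, continuity=cc)
    print(cc, round(z ** 2, 6), round(chi_square(x, n, p, continuity=cc), 6))
```

The agreement follows from the identity $d^2/(Np) + d^2/(Nq) = d^2/(Npq)$, so in this two-fold case the choice between the two forms is one of convenience.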

The author has to some extent achieved his goal of classifying certain 
types of problems relating to enumeration data, and of telling the investi¬ 
gator how to find his problem and a suitable answer for it. The tables sup¬ 
plied are indeed very comprehensive and useful. However, it seems to this 
reviewer that the approach is too mechanistic and would be unsatisfactory 
for many investigators. 


Review by A. Bradford Hill 
Professor of Medical Statistics and Director of the Department 
London School of Hygiene and Tropical Medicine 
University of London, London W.C.1, England

Dr. Mainland says that his article was devised “to meet the wishes of
those who, in the words of one investigator, would say: 'I have a problem 
on hand .. . Must I spend a month of free evenings reading a book from 
end to end several times and mastering all details before deciding how to go 
about solving the problem? I hope not.’” He might have retorted that work¬ 
ers in the medical sciences would not expect to use, say, bacteriological or 
pathological techniques without mastering the details and is there any rea¬ 
son why statistical processes should be regarded differently? There is, per¬ 
haps, at least an excuse. Few workers unless trained in such subjects as 
bacteriology or pathology will be so bold as to embark upon them; almost 
all, whatever their subject matter, will sooner or later be faced with statis¬ 
tical data and have to interpret them. Often too, particularly in clinical 
medicine, their numbers of observations will be small. It is, therefore, legitimate
to argue that it is better to give the worker easy access to tests of significance
which he may imperfectly understand rather than to let him rely
solely upon that “common-sense” which is, in fact, so uncommon. 

The serious danger of this procedure, which Dr. Mainland recognizes, is 
that the worker may come to regard the mathematical tests as the most 
important part of the statistical methodology and forget that of much 
greater importance “are, first, the planning of the experiment or observation 
so that valid inferences shall be obtainable, and, secondly, the interpretation 
of the results of the mathematical tests.” In the experience of the reviewer 
the latter is the greater risk, that too frequently today there is a tendency 
to regard “non-significant” as implying guiltless rather than non-proven, 
“significant” as proven and therefore due to a particular causal factor. Dr.
Mainland has certainly endeavoured throughout his article to guard against 
these very undesirable by-products of his plan of presentation. 

This plan is as follows. In an introductory section he discusses, briefly, 
some general principles and definitions—random sampling, probability, 
confidence limits, the comparison of samples, levels of significance, etc. He
then passes to what is the crux of the article for the investigator quoted at 
the beginning of this review, namely the working out of 40 examples of medi¬ 
cal problems classified so that the worker can choose data and problems com¬ 
parable to his own and then easily carry out the demonstrated probability 
calculation. As the article is confined to qualitative statistics the types of 
problem are mainly the argument from a sample to its population and 
the comparison of two or more samples (with subsidiary questions that 
flow from them). A final section of “notes” discusses the underlying prin¬ 
ciples and methodology in much greater detail and the article concludes 
with some extremely useful tables. These comprise binomial confidence 
limits (with graphs as well) over a wide range of size of sample and of values, 
and also exact probabilities for small-sample fourfold contingency tables— 
the probabilities for equal samples up to N equals 20 and the significant 
differences for unequal samples up to $N_1$ equals 20 and $N_2$ equals 19. These
latter should clearly be of great help to many workers, as will also a table of 
the logarithms of factorials of numbers up to 1,000 for the calculation of 
exact probabilities not tabulated. For samples not covered by the tables 
precautions and rules regarding the use of chi squared have been derived 
from more than five hundred comparisons between chi squared and the 
exact method. 
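
For a sample not covered by the tables, the exact probability for a fourfold table can be computed from logarithms of factorials, which is the use the review suggests for the factorial table. The sketch below uses natural logarithms from the log-gamma function in place of a printed table of common logarithms; the function names and the example counts are hypothetical, and the one-sided summation shown is one common convention rather than necessarily the rule the article adopts.

```python
import math


def log_factorial(n):
    """Natural logarithm of n!, via the log-gamma function."""
    return math.lgamma(n + 1)


def table_probability(a, b, c, d):
    """Exact (hypergeometric) probability of a fourfold table with fixed margins."""
    n = a + b + c + d
    log_p = (
        log_factorial(a + b) + log_factorial(c + d)
        + log_factorial(a + c) + log_factorial(b + d)
        - log_factorial(n)
        - log_factorial(a) - log_factorial(b)
        - log_factorial(c) - log_factorial(d)
    )
    return math.exp(log_p)


def one_sided_exact(a, b, c, d):
    """Probability of the observed table or one more extreme in the same
    direction (a decreasing), with all marginal totals held fixed."""
    total = 0.0
    while a >= 0 and d >= 0:
        total += table_probability(a, b, c, d)
        a, b, c, d = a - 1, b + 1, c + 1, d - 1
    return total


# Hypothetical small samples: 1 of 10 affected in one group, 6 of 10 in the other.
print(round(one_sided_exact(1, 9, 6, 4), 4))
```

Working in logarithms keeps the factorials of larger totals from overflowing, the same reason a printed table of log factorials was useful for hand calculation.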

A criticism that might be made is that Dr. Mainland is rather inclined to 
overstate the case for using the “exact” methods he gives—how often, in
fact, would the observer be misled by the cruder methods if he were cautious 
in borderline cases?—and to place considerably too much confidence in the 
results given by very small samples. While agreeing with him that “no 
sample is too small for statistical assessment” one may yet, for instance, 
with a mere handful of sick persons to compare remember their innate vari¬ 
ability and Dr. Mainland’s own emphasis on the importance of “the inter¬ 
pretation of the results of the mathematical tests.” However that may be, 
this heavy piece of work should certainly help the medical investigator to 
apply without tears his tests of significance to small samples—though it is
unlikely that he will do so intelligently unless he is prepared to take some 
trouble to understand what it is all about. 


Mathematical Theory of Human Relations: An Approach to a Mathematical
Biology of Social Phenomena. N. Rashevsky (Associate Professor of Mathematical
Biophysics, University of Chicago, Chicago 37, Ill.). Mathematical
Biophysics Monograph Series No. 2. Bloomington, Ind.: Principia Press, 1948.
Pp. xiv, 202. $4.00.

Review by Frederick Mosteller
Associate Professor of Mathematical Statistics 
Department of Social Relations, Harvard University 
Cambridge 38, Massachusetts

Rashevsky’s Mathematical Theory of Human Relations has the subtitle
“An Approach to a Mathematical Biology of Social Phenomena.” It 
is interesting to notice that Rashevsky feels that such topics as distribution
of city sizes, economic interaction of the social group, variations in the class 
structure of a group, individualistic (capitalistic) and collectivistic (cooper¬ 
ative) societies, history of nations, theory of war, all come within the scope 
of mathematical biology. This view fits in well with the present breakdown 
of borders between sciences. 

The writings are largely publications of Rashevsky in Psychometrika
collected in such a way as to provide continuity to the exposition. The con¬ 
tinuity is more one of method than of subject matter. The casual reader 
will find that the topics dodge around rather rapidly. Indeed, the book is 
something of a hodge-podge. It contains many early thoughts not very thor¬ 
oughly worked out but apparently put down quickly as they came to mind 
by a rather prolific but not very elegant writer. 

The principal method employed is that of differential equations, used 
somewhat in the manner of the applied physicist. One has the feeling that 
the problems were made to fit the mathematics with which the author has 
been successful in treating other problems, rather than making the mathe¬ 
matics suitable to the problems. Occasionally the author sidles into integral 
equations but no serious attempt is made to do anything about them. 
However, the integral equation approach did look rather promising before 
it was dropped like a hot potato. 

In reviewing such a book about human relations, one has to consider the 
scarcity of mathematical works on this subject outside the fields of economics 
and population. Naturally statistical methods are widely used in all social 
sciences, but these are usually employed for descriptive purposes rather than
as mathematical models. Occasionally there are statistical or probability 
models which can be classified as mathematical models in the sense that they 
try to explain the way certain processes combine to produce a certain out¬ 
come, and these models have the property that many different aspects of 
the situation can be derived from the original set of assumptions. The work 
of Zipf has been largely the collection of certain kinds of number anomalies 
with guesses about the sociological meaning of these anomalies, while 
Stewart working on the same subject seems to be trying to build up a theory 
leading to these number anomalies from a set of assumptions. Starting from 
theory which he has developed for another purpose, Rashevsky tries to
investigate the distribution of city sizes but is not very successful. Lewis F.
Richardson in Generalized Foreign Politics (British Journal of Psychology
Monograph Supplements, No. 23, 1939) attempts to study the theory of
stability of peace between two or more nations largely by the study of the
behavior of linear differential equations. Richardson is not quite so ambitious
as Rashevsky. He studies conditions which will lead to war and does not
attempt to say when war will occur, nor when one side or the other will be
defeated, nor how the action will be carried out, while Rashevsky does make
attempts of such a nature. The contrast between the work of Richardson
and that of Rashevsky is worth noting because the one man takes a single
topic and works it very extensively, while the other prefers to handle many 
topics thinly. 





Rashevsky places a critic of his work in a very difficult position. He states,
as we would expect any man building mathematical models to do, that none 
of the models he presents are to be taken as sacred or complete or more than 
a gross oversimplification of the techniques he has in mind. Further, even 
when he goes out of his way to get some data and compare his theory with 
some facts, he claims that no one should take the results of the comparison 
seriously as supporting the particular theory he has in mind. In every case 
as far as the reviewer can tell, he regards his examples as “only an illustration”
of what a man constructing mathematical models might hope to achieve
and improve on if he were to make a careful extensive study of the problem. 
This attitude makes it difficult for us to know whom Rashevsky wrote the
book for. Presumably a man familiar with mathematical models would know 
something about the kinds of things that might be achieved through the use 
of mathematics and therefore would not need all these illustrations. The 
social scientist who does not know himself how to handle mathematical 
models will probably feel that instead of producing all these illustrations of 
what might be accomplished, Rashevsky might have done better to take one
problem and work on it. His attitude might be that one good investigation 
would win him over. Probably Rashevsky protests more than he means and
really feels he has a fairly general approach to many social science problems, 
and he may even feel that he has produced a good framework for building. 
In addition, Rashevsky may also feel that the reason so little work has been
done by applied mathematicians on social science problems (outside the 
afore-mentioned fields of economics and population) is that the mathemati¬ 
cians see no way to attack these problems; that if encouragement of the kind 
he is offering is given, research people may see their way clear to relieving 
the scarcity of work in this field. It is very possible that this book may have 
the effect of goading researchers to work in this field, because some may feel 
that Rashevsky has stated his problems poorly and that too many problems
are left wide open by Rashevsky's approach. If the book produced only this
effect, the author will have made an important contribution to the develop¬ 
ment of social science. 

The book opens with a “Preface and Explanatory Remarks” section in
which the author gives some arguments why mathematics should be allowed 
to be used in the study of social phenomena and includes a fairly lengthy 
criticism of this work by the author. Indeed, anyone wishing to criticize 
this book will be helped by reading Rashevsky's own criticisms. Chapter 1 
considers the nature and effect of the influence of one individual on another 
and provides a definition of social class. Generalizations are achieved by 
introducing the notions of distribution of individuals in space, and in time, 
and the notion of social mobility. It is unfortunate that this important first
chapter is not written a little more carefully; for example, on page 3 line 6 
the reader is confused between an activity and the intensity of an activity. 
On page 4 the author has not been careful about his use of absolute value 
signs or else he has changed his assumptions without informing the reader.
It is likely that the reader will be a little startled to see a multiple integral 
with definite limits suddenly lose two of its three integration signs with no 
caution from the author that the single integration sign is to stand for all 
three, even though it has definite limits attached which differ from those on
the other two. 

Chapters 1 and 2 are largely confined to a discussion of functions of several 
variables with no definite form assigned. Averages or expected values are 
largely used. In the study of individuals grouping themselves into classes 
the author considers the case of one variable $F$ and defines individuals as
belonging to the same class when $(F' - F)^2 - A^2 < 0$, where the $F$'s are the
values of the characteristic determining the class structure for the two individuals
and $A$ is some constant. For large groups of individuals, the extent
of the upper class is found by averaging the left-hand side of the above inequality
over a portion of the joint distribution of two individuals independently
drawn from the distribution of the characteristic. If we know
$N(F)$, the distribution of the characteristic, and the number of classes in
the society, we can in principle calculate $A$.

In Chapter 3 Rashevsky gives an approximate treatment of the interaction 
of social classes in which he uses an all-or-none principle, that is, either
individuals are “active” or “passive.” The active individuals belong to two 
groups each with a single activity in which they try to persuade the passive 
members to join them. The problem seems to be to see what kinds of conditions
will lead to all the passive individuals occupying themselves with one
activity or the other. Rashevsky feels that his results agree in general with 
the rapid spread of mass hysterias and revolts, and the reviewer feels that 
the ideas may approximate the results observed in fads, fashions, rumours, 
or propaganda. It is interesting to note that the differential equations pro¬ 
duced are of the same form as Richardson’s armament equations. This is 
not surprising because Rashevsky is dealing with warfare between two 
groups for possession of a third. Rashevsky’s equations simplify more than 
Richardson’s because of an additional restriction. Richardson is not men¬ 
tioned. 

Chapter 4 is an extremely useful chapter from the point of view of the 
social scientist not well acquainted with mathematical models and the adequacy 
of various kinds of approximations met with in mathematical physics. 
In this chapter, entitled “A More Exact Treatment of the Previous Case,” 
Rashevsky shows that allowing individuals to distribute themselves on an 
active-passive scale instead of forcing each individual to assume one or the 
other end of the continuum can lead to essentially the same results as those 
given by the more approximate methods of Chapter 3. This procedure can 
teach the social scientist that a process of simplification in mathematical 
models does not necessarily lead to the loss of essentials. In other words, one 
should not use too glibly the pat phrase: “Of course, this treatment is much 
too over-simplified to be of any real use.” 

Chapters 5 and 6 deal with economic problems raised by the existence 
of persons in a society who are so good at organizing that the whole group 
can gain if the workers will work under the direction of the organizers, and 
the organizers are willing to organize the workers. 

Chapter 7 might interest the practical man, although the illustrations may 
seem to him to be tours de force. It suggests how previously developed theory 
can be applied to estimating the ratio of the population of the capital of a 
country to the urban population less that of the capital, using the proportion 
of national income taken in taxes. The population of the capital is taken as 
an index of the size of the governing class and the urban population as an 
index of the total number of active individuals in a society composed of three 
classes: the governing, the organizing, and the passive. The results for Ger¬ 
many, France, and the United States look very good, but for England the 
capital is too large. The reviewer does not think population of a capital a 
good enough index of size of governing class to make the example convincing. 
Another example is the prediction of the incidence of crime from taxes and 
population density, while still a third example deals with the divorce rate. 
The fit of the calculated quantities to the observed data is rather encouraging. 
It should be mentioned that in Chapter 6 by a sequence of crude approxima¬ 
tions a formula is obtained for estimating the per capita income of a country 
in terms of its urban population percentage and its population density. The 
results do not seem to fit very well in this case. In all these cases Rashevsky 
feels that the real interest attaches to the fact that certain relations are sug¬ 
gested even by an inadequate theory which then helps us notice such rela¬ 
tions when they occur. 

Chapter 9 is concerned with two notions of individual freedom: the first 
concerns economic freedom and is rather suggestive, the second deals with 
freedom of an individual to choose among many activities and seems to the 
reviewer to fall flat on its face. 

Chapter 10 deals with the distribution of the per cent urban population in 
a growing society and considers two or three possible assumptions. Data are 
given showing population of a country against per cent urban. The curve de¬ 
rived for the United States fits the data pretty well, although a straight line 
would fit them better, but that for Germany is extremely convincing. The 
data for Russia (7 points) are fitted by a two-branch curve with the aid of 
arguments about the reform history of the country. The data for Sweden 
are the most interesting available, but are dismissed with the statement 
that the theory is inadequate to explain them although the rapid reader 
might think that the excellently fitting curve shown is a derived one. The 
reviewer cannot tell whether this curve is derived or not but suspects that it 
is a free-hand fit. 

Chapters 8 and 9 deal with city sizes and do not seem to reach a very suc¬ 
cessful conclusion. 

Chapters 14 and 15 deal with social classes, social mobility, production, 
and the effects of restrictions. Chapter 17 concerns some consequences of 
previous theory and the theory is extended to estimate the percentage of 
per capita income spent for military purposes in various countries and also 
the number of inventions by various countries. The theory also suggests 
that the “influence” of a country is proportional to $N^2/S$, where N is the 
population and S the area. Graphs for various countries from 1600 to the 
present do not violate a reader’s intuition about which countries had great 
influence during this period. 

Chapters 19 and 20 are concerned with individualistic as opposed to collectivistic 
behavior and here Rashevsky draws heavily on G. C. Evans’ 
Mathematical Introduction to Economics. The principal result is that individuals 
may profit more individually by trying to maximize the group satisfaction 
rather than their individual satisfaction. 

Chapter 21, “Some Considerations of the History of a Few Nations,” dis¬ 
cusses largely in hand-waving terms what happens under various degrees of 
interclass mobility. In other words, the happenings in several different coun¬ 
tries, Russia, China, England, and the United States are talked about in 
terms of some of the theory, although not really derived from the theory. 
As in many history books, the discussion of the United States concludes 
“To what extent this shift toward governmental control will continue cannot 
be predicted on the basis of the present theory” (p. 180). 

Chapters 22 and 23 have to do with physical conflict between groups or 
nations. The theory developed is one of the variations of Lanchester’s Law, 
although Lanchester is not mentioned. 

A few misprints, mostly minor, were noted. The more important are: 
page 17, equation (9), C should be (; page 18, line 5, > should be page 
78, equation (6), delete second equal sign; page 84, equation (16), bar of 
Radical should not cover second term. 

The most important thing is that a book has appeared which tries to 
treat a variety of social problems by means of mathematical models. That 
the attempts have met with varying degrees of success is not too important. 
The results given are certainly successful enough to encourage others to 
make further attempts. Indeed, some of the basic material presented here is 
worth extending along the lines indicated by the author and worth supple¬ 
menting with practical numerical examples drawn from data. 


GEORGE BANTA PUBLISHING COMPANY, MENASHA, WISCONSIN 






JOURNAL OF THE AMERICAN 
STATISTICAL ASSOCIATION 


Volume 44 


June 1949 


Number 246 


ARTICLES 

The Current Status of State and Local Population Estimates in the Census 

Bureau . . . Henry S. Shryock, Jr., and Norman Lawrence 157 

The Uses and Usefulness of Binomial Probability Paper. 

.Frederick Mosteller and John W. Tukey 174 

Teaching Statistical Quality Control for Town and Gown. 

.Edwin G. Olds and Lloyd A. Knowler 213 

The Use of Sampling in Great Britain.C. A. Moser 231 

Unemployment and Migration in the Depression (1930-1935) .... 

.Ronald Freedman and Amos H. Hawley 260 

Minimum χ² and Maximum Likelihood Solution in Terms of a Linear 

Transform, with Particular Reference to Bio-Assay. 

.Joseph Berkson 273 

Some Inadequacies of the Federal Censuses of Agriculture. 

.Raymond J. Jessen 279 

The Edge Marking of Statistical Cards.A. M. Lester 293 

Conrad Alexander Verrijn Stuart (1865-1948) . Walter F. Willcox 295 

PROCEEDINGS OF THE 108TH ANNUAL MEETING 


Minutes of the Annual Business Meeting.297 

Report of the Board of Directors.300 

Report of the Secretary.303 

Report of the Nominating Committee.304 

Report of the Committee on Fellows.304 

Minutes of the Meeting of the Commission on Statistical Standards and 

Organization.305 

Report of the Treasurer.307 

Report of the Auditors.308 


BOOK REVIEWS 

Anderson, J. L. and Dow, J. B., Actuarial Statistics: Vol. II, Construction 

of Mortality and Other Tables .T. N. E. Greville 311 

Chapin, F. Stuart, Experimental Designs in Sociological Research 

.Margaret Jarman Hagood 312 






















Charlier, C. V. L., Elements of Mathematical Statistics Including Table of 
Poisson’s Function by L. v. Bortkiewicz . . Burton H. Camp 313 

...Alexander M. Mood 314 

Douglass, Raymond D., and Adams, Douglas P., Elements of Nomography 
. Joseph Zubin 315 

Gallup, George, A Guide to Public Opinion Polls, Second Edition . . 
.Robert Cobb Myers 315 

Hald, A., The Decomposition of a Series of Observations Composed of a Trend, 
a Periodic Movement, and a Stochastic Variable . 
.D. B. DeLury and Boyd Harshbarger 317 

Kennedy, Clifford W., Quality Control Methods . 
.Sebastian B. Littauer 320 

.Charles R. Scott, Jr. 322 

Matérn, Bertil, Metoder att Uppskatta Noggrannheten vid Linje- och 
Provytetaxering. [Methods of Estimating the Accuracy of Line and Sample 
Plot Surveys] . T. W. Anderson 323 

von Mises, Richard, Lecture Notes on Mathematical Theory of Probability 
and Statistics .. Benjamin Epstein 326 

Sumner, W. L., Statistics in School . F. G. Cornell 327 

van Uven, M. J., Mathematical Treatment of the Results of Agricultural and 
Other Experiments, Second Edition . G. A. Baker 329 

Wintner, Aurel, The Fourier Transforms of Probability Distributions 
. J. Wolfowitz 330 

Wold, Herman, Random Normal Deviates: 25,000 Items Compiled from 
Tract No. XXIV (M. G. Kendall and B. Babington Smith’s Tables of 
Random Sampling Numbers) .. H. Burke Horton 331 

Zeisel, Hans, Say It With Figures .. . Gregor Sebba 332 




















JOURNAL OF THE AMERICAN 
STATISTICAL ASSOCIATION 


Number 246 JUNE 1949 Volume 44 


THE CURRENT STATUS OF STATE AND LOCAL 
POPULATION ESTIMATES IN THE 
CENSUS BUREAU* 

Henry S. Shryock, Jr., 

AND 

Norman Lawrence 
Bureau of the Census 

Early in this century, the Bureau of the Census was already 
experimenting with postcensal State and local population esti¬ 
mates based on such symptomatic data as the number of public utility 
consumers, voting registrations, counts from city directories, and 
school census figures. Estimates were apparently made by the simple 
ratio method, and it was decided that they were not so satisfactory 
as those made by a simple mathematical method. For many years, 
virtually all the published estimates were linear projections of the 
last two census figures. Later, when the growth of birth and death 
registration areas made possible fairly accurate estimates of the na¬ 
tional total population, an apportionment formula was used to prorate 
the postcensal national growth among the States and cities in accord¬ 
ance with their previous relative growth. 

Although many estimates by these two methods were published, the 
Bureau recognized that the average errors were large enough to cast 
doubt on their usefulness. Sudden changes in population trends during 
the early years of the depression of the ’thirties made it obvious that 
these mathematical assumptions could no longer be used. In 1936 the 
Bureau published its first series of State estimates by what has been 
called the “migration and natural increase method.” Here the components 
of population change, natural increase and net migration, were 
estimated separately, the migration component being derived from a 
comparison of actual with expected school enrollment. Current State 
estimates were discontinued after 1937 pending the results of the 1940 
census. It had been planned to resume publication after 1940 using an 
improved migration and natural increase method, based on school 
data or perhaps some other symptomatic series; but the war changed 
the situation in several ways. 

* Adapted from a paper read by the first-named author before a Customer Administrative School 
for Public Health Executives and Vital Statistics Registrars at the International Business Machines 
Corporation, Endicott, N. Y., July 2, 1947. 

The authors wish to express their appreciation to Mr. Benjamin Greenberg, who assisted in the 
preparation of the population estimates given here, and to Dr. Joseph F. Daly, who directed their 
attention to the new tests of statistical significance in an unpublished thesis by John Edward Walsh of 
Princeton University. 

The program of current population estimates in the Census Bureau 
received a windfall in the registrations for war ration books in 1942 
and 1943. Primarily from this source, several series of State and county 
estimates were constructed for these years.¹ These were the most 
comprehensive such postcensal estimates based on empirical data ever 
published. On the average, they were undoubtedly the most accurate 
also.² With a few exceptions, no population estimates for cities or 
counties have been published by the Bureau since those for 1943. 

The valuable war ration book figures were not, however, an un¬ 
mixed blessing to producers and consumers of population estimates. 
Their existence interrupted the experimentation with the above- 
mentioned symptomatic series, such as school data, that are available 
in peacetime as well as in war years. A further delaying influence was 
the prospect of a national sample survey of population in October, 
1946, from which population totals for States and large cities would 
have been forthcoming. The Congress appropriated money for the 
preparatory work on this survey, but in the following fiscal year it 
failed to approve the necessary remaining funds. From individual 
sample surveys in the fall of 1946 and April, 1947, the Census Bureau 
obtained estimates of total population for selected large cities and 
metropolitan districts, however. These will be discussed later. 

¹ U. S. Bureau of the Census. Population. “Estimates of the civilian population by counties: May 1, 
1942.” Series P-3, No. 33. February 25, 1943. 

U. S. Bureau of the Census. Population. “Estimates of the civilian population by counties: March 1, 
1943.” Series P-3, No. 38. October 31, 1943. 

U. S. Bureau of the Census. Population. “Estimated civilian population of metropolitan counties, 
by single counties: March 1, 1943, and May 1, 1942.” Series P-3, No. 40. January 7, 1944. 

U. S. Bureau of the Census. Population—Special Reports. “Estimated civilian population of the 
United States, by counties: November 1, 1943.” Series P-44, No. 3. February 15, 1944. 

² Hauser, Philip M., and Tepping, Benjamin J., “Evaluation of census wartime population estimates 
and of predictions of postwar population prospects for metropolitan areas.” American Sociological 
Review 9 (5): 473-480. Oct., 1944. 

Meanwhile, the demand for State and local estimates was mounting. 
People were conscious of the sweeping changes that had occurred since 
the outbreak of the war, and the importance of measuring them. Vital 
statisticians, market analysts, directors of planning boards, and a host 
of others had mailed or telephoned their urgent requests. The time of 
our small estimating staff was taken up with a wide variety of national 
estimates, and State estimates were next on the list. Within the foreseeable 
future, we knew we could not make estimates for all the 
counties and middle-sized cities. Many State health departments just 
had to have such estimates, however; and the question of giving them 
technical advice was raised. 

METHODS USED RECENTLY 

In 1946, Dr. Hope T. Eldridge, then of the Population Division of 
the Census Bureau, spoke before the Council on Vital Records and 
Vital Statistics. She described a method for making postcensal county 
estimates on the basis of school data, vital statistics, and figures on the 
armed forces. Later she expanded this paper, and it was published as 
“Suggested procedures for estimating the current population of 
counties.”³ A second and more detailed method was also outlined in this 
article. 

The second method, slightly modified, was used in making most of 
the published State estimates for 1946 and 1947.⁴ Estimates by both 
methods were also computed for all the cities surveyed in the autumn 
of 1946 and the spring of 1947 in order to have independent checks on 
the inflated sample figures. The prime purpose of the present report is 
to present what has been learned about these methods and their pitfalls. 
The Census Bureau would also be greatly interested in hearing 
of the experience of other workers with these methods. 

³ U. S. Bureau of the Census. Population—Special Reports. Series P-47, No. 4. April 30, 1947. 

⁴ U. S. Bureau of the Census. Current Population Reports, Population Estimates. “Estimates of the 
population of the United States, by regions, divisions, and States: July 1, 1940 to 1947.” Series P-25, 
No. 12. August 9, 1948. This release also contained revised estimates for July 1, 1940 to 1945. The earlier 
State estimates for July, 1944 and 1945, were to a considerable extent simply extrapolations of those 
for November, 1943. Subsequent data on natural increase and shifts between civilian and military 
populations had been utilized, but net interstate migration had been estimated from that for the period 
1940 to 1943. 

The methods described in Dr. Eldridge’s article may be summarized 
briefly here. Detailed examples of each method are available on request 
to the Bureau of the Census. The simpler method, Method I, assumes 
that the difference between the percentage change in elementary school 
enrollment for the local area and the national percentage change in 
population of elementary school age is equal to the percentage change 
through net migration to or from the local area. The base date for a 
State, city, or county may be November, 1943, or any other date for 
which the total population is available. Other figures needed to 
complete the estimates, namely, natural increase, persons away in the 
armed forces on the base and estimate dates, and members of the 
armed forces stationed in the county on the estimate date, are handled 
in a fairly direct manner. 
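
The arithmetic of Method I can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, argument names, and sample figures are hypothetical, the migration rate is applied to the base-date population, and the armed-forces items are collapsed into a single net adjustment rather than handled item by item.

```python
def method_i_estimate(base_pop, enroll_base, enroll_now,
                      us_school_age_base, us_school_age_now,
                      births, deaths, net_armed_forces_loss=0.0):
    """Hypothetical sketch of a Method I estimate for one local area."""
    # Percentage change in local elementary school enrollment.
    local_change = enroll_now / enroll_base - 1.0
    # National percentage change in population of elementary school age.
    national_change = us_school_age_now / us_school_age_base - 1.0
    # Method I treats the difference as the rate of net migration.
    migration_rate = local_change - national_change
    # Assumption: the rate is applied to the base-date population.
    net_migration = migration_rate * base_pop
    natural_increase = births - deaths
    return base_pop + natural_increase + net_migration - net_armed_forces_loss


# Invented figures: enrollment up 10 per cent locally, school-age
# population up 2 per cent nationally, so net migration is put at 8 per cent.
estimate = method_i_estimate(
    base_pop=100_000, enroll_base=10_000, enroll_now=11_000,
    us_school_age_base=20_000_000, us_school_age_now=20_400_000,
    births=6_000, deaths=3_000, net_armed_forces_loss=1_000)
print(round(estimate))
```

With these invented inputs the estimate is the base population plus 3,000 natural increase plus 8,000 net migration less 1,000 net loss to the armed forces.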

Method II differs from Method I only in a more elaborate technique 
of estimating net migration from the school data. By this technique, 
net migration is measured on the basis of the difference between the 
population of elementary school age as estimated from the school data 
and the expected population of that age had there been no migration 
since the base date. The expected population is computed by applying 
survival rates from an appropriate life table to the population cohort 
at the base date that became the population of elementary school age 
at the estimate date. This "aging” process must also be applied to 
cohorts of births occurring after the base date when estimates are 
being made for the later postcensal years. To illustrate, the expected 
population 6 to 13 years old on April 1, 1947, assuming no migration, 
comprises survivors of births between April 1, 1940, and March 31, 
1941, as well as survivors of children under exact age 7 on April 1, 
1940. Since data on the population by age are required for the base 
date, it is usually necessary to go back beyond 1943 to the last census. 
The “actual” population of elementary school age with which the 
expected population is compared is computed from the school datum 
for the estimate date by means of an appropriate ratio based on past 
experience. Some assumption must be made about the relation of the 
rate of net migration of the school-age population and that of the 
population of all ages, the simplest assumption being that the rates 
are the same. This is a critical assumption and is discussed more fully 
below. 
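
The migration step of Method II might be sketched as follows. The names and figures are hypothetical, the survival rates stand in for values read from a life table, and expressing net migration as a rate on the expected population is an assumption; the article does not spell out the denominator.

```python
def expected_school_age_pop(cohorts, survival_rates):
    """Survive each cohort from the base date (or birth) to the estimate date."""
    return sum(n * s for n, s in zip(cohorts, survival_rates))


def method_ii_migration_rate(cohorts, survival_rates, enrollment, pop_per_pupil):
    """Hypothetical sketch: net migration rate of the school-age population."""
    expected = expected_school_age_pop(cohorts, survival_rates)
    # "Actual" school-age population: the school datum times a ratio
    # based on past experience (for example, the ratio at the last census).
    actual = enrollment * pop_per_pupil
    # Assumption: the rate is taken on the expected (no-migration) population.
    return (actual - expected) / expected


# Invented figures: two cohorts, one of children enumerated at the base
# date and one of later births, each with its own survival rate.
rate = method_ii_migration_rate(
    cohorts=[1_000, 2_000], survival_rates=[0.90, 0.95],
    enrollment=2_800, pop_per_pupil=1.10)
print(round(rate, 4))
```

Under the simplest assumption mentioned in the text, this same rate would then be applied to the population of all ages.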

COMPARISON OF METHODS 

Comparisons have been made between estimates made by these 
methods and population figures from other sources. These sources 
include the following: (1) Special censuses (complete counts) taken by 
the Census Bureau at the request and expense of local governments; 
(2) Censuses of Congested Production Areas taken in 1944 by the 
Bureau in selected metropolitan counties—these were sample censuses 
with a coefficient of variation of less than one per cent for the total 
population of each Congested Area; (3) State censuses; (4) Estimates 
made from the registration for war ration books. 

Estimates made by Methods I and II have been compared with 
population standards of these sorts for 48 States, 25 cities, and 102 
counties. Some cities and counties appear more than once in the list, 
for different dates. Some of the standards are complete counts. Other 
standards are subject to a fairly large amount of error; but, on the 
whole, they should be quite accurate. The population standards for 
States are the November, 1943, estimates based on War Ration Book 
Number 4 and published in Series P-44, No. 3. For the 25 cities, the 
standards were as follows: Ration Book estimate, 7; State census, 7; 
Census of Congested Production Areas in 1944, 9; and Census Bureau 
special census, 2. The standards for the 102 counties were distributed 
as follows: Ration Book figure, 43; State census, 42 (4 in Florida, 35 in 
Kansas, and 3 in Massachusetts); Census of Congested Production 
Areas, 16; and Census Bureau special censuses, 1. Similar population 
standards were available for many other cities and counties, but school 
data had not been collected for them. All figures are for the civilian 
population. The estimate dates range from 1943 to 1947. 

The results of these two empirical methods may also be compared 
with those from two methods that do not use local postcensal data. 
The first of these latter is simple arithmetic projection. The total 
population of each area was projected on the basis of the 1930 to 1940 
trend. This figure was converted into one for the civilian population 
by assuming that the area had the same proportion of its de jure 
population in the armed forces as the national population did at the 
same date. The second method is the apportionment method used by 
the Bureau of the Census in the early 'thirties. It assumes that the 
area had the same share of the national increase in the postcensal 
period as in the last intercensal period. Because the civilian population 
decreased over most of the period in which we are interested, the apportionment 
method could not be applied in the form just described. 
We modified the usual method by using the total population, which as 
a final step was converted into the area’s civilian population, again by 
assuming the same proportionate loss to the armed forces by each 
area as was lost by the United States as a whole. (In the case of areas 
that had lost population between 1930 and 1940, it was assumed that 
there was no further decline in the total population after 1940. This 
total was reduced as just described to get an estimate of the postcensal 
civilian population.) 
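
The two mathematical methods can be written out compactly. The sketch below is one reading of the description above, with hypothetical function names and invented figures; applying the no-further-decline rule only inside the apportionment function is an assumption.

```python
def arithmetic_projection(pop_1930, pop_1940, years_after_1940):
    """Continue the 1930 to 1940 linear trend in total population."""
    return pop_1940 + (pop_1940 - pop_1930) * years_after_1940 / 10.0


def apportionment(area_1930, area_1940, us_1930, us_1940, us_now):
    """Give the area its 1930-40 share of the national postcensal increase."""
    if area_1940 <= area_1930:
        # Areas that lost population: assume no further decline after 1940.
        return float(area_1940)
    share = (area_1940 - area_1930) / (us_1940 - us_1930)
    return area_1940 + share * (us_now - us_1940)


def to_civilian(total_pop, us_civilian, us_total):
    """Assume the area lost the same proportion to the armed forces as the nation."""
    return total_pop * us_civilian / us_total


# Invented figures for a single area, five years after the 1940 census.
projected = arithmetic_projection(90_000, 100_000, years_after_1940=5)
apportioned = apportionment(90_000, 100_000,
                            us_1930=123_000_000, us_1940=132_000_000,
                            us_now=141_000_000)
civilian = to_civilian(projected, us_civilian=130_000_000, us_total=140_000_000)
print(round(projected), round(apportioned), round(civilian))
```

Neither function uses any local postcensal data, which is the point of the comparison with the school-data methods.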

There is not space to present here the various estimates and their 
deviations from the population standard for each of the 175 States, 
cities, and counties. A summary of the deviations is given in Table 1, 
however, for each type of area. Happily, the extra work of the school-data 
methods of estimation appears to be decidedly worthwhile. In 
each type of area, the arithmetic mean of the percentage deviations 
disregarding sign is lower for both Method I and Method II than for 
either of the two mathematical methods, arithmetic projection and 
apportionment. (This fact does not gainsay that for some individual 
areas the percentage deviation was lower on the part of the mathematical 
than of the school-data method.) Average “errors” of 6 or 
7 per cent over a period of 3 to 7 years by the better methods may not 
seem satisfactory for some purposes, but at least they compare favorably 
with the 12 per cent that would have resulted from the use of 
conventional methods. 


TABLE 1 

SUMMARY OF PERCENTAGE DEVIATIONS FROM POPULATION STANDARD OF 
ESTIMATES BY FOUR METHODS FOR SELECTED AREAS AND DATES 
(See text for description of methods) 

                                   Number    Method   Method   Arithmetic   Apportion- 
Item                               of Areas    I        II     Projection   ment 

Average percentage deviation 
(disregarding sign) 
  All areas                          175       7.06     6.48     12.08       12.17 
  States                              48       4.50     3.08      6.14        6.09 
  Cities                              25       6.76     5.13     12.89       11.34 
  Counties                           102       8.34     8.40     14.67       15.24 

Root-mean-square percentage 
deviation 
  All areas                          175       9.43     9.29     16.89       16.75 
  States                              48       5.32     4.29      7.44        7.29 
  Cities                              25       7.75     7.35     14.54       12.96 
  Counties                           102      11.16    11.23     20.31       20.37 

Deviations of 10 per cent or above 
  All areas                          175      43       40        82          83 
  States                              48       3        2        10           9 
  Cities                              25       7        3        16          13 
  Counties                           102      33       35        56          61 

Positive deviations 
  All areas                          175      92       92        70          75 
  States                              48      32       29        28          28 
  Cities                              25       9       10         1           2 
  Counties                           102      51       53        41          45 

It is of interest to investigate whether any of the differences between 
the average per cent deviations shown in Table 1 might reasonably be 
attributed to chance variation. In computing the significance of the 
average percentage deviation for one method as compared with another, 
one should take account of the correlation between the deviations by 
the one method with those by the other. It is not surprising that there 
is a correlation between the deviations by Methods I and II and also 
between those by arithmetic projection and apportionment. (For all 
areas, there is a correlation coefficient of +.82 between the first two 
methods and of +.95 between the last two.) It is perhaps surprising, 
however, that there are also significant positive correlations between 
the deviations of a school-data method and those of a mathematical 
method, r ranging from +.36 to +.47. All correlations are positive 
within each type of area: States, cities, and counties. In other words, 
although the mathematical and empirical methods have no common 
assumption, a population that is difficult to estimate by one type of 
method also tends to be difficult by the other. 

In testing the significance of the difference between the averages 
of two sets of deviations, we have in all cases allowed for the positive 
correlation by getting the difference between paired percentage devia¬ 
tions (disregarding sign) from two methods and testing whether the 
mean difference is significantly larger than zero. We thus obtain for all 
areas combined: 



                                                                     t        P 

Between Method I and Method II (7.06 vs 6.48)                      1.98      .05 
                 and Arithmetic projection (7.06 vs 12.08)         5.91    <.001 
                 and Apportionment (7.06 vs 12.17)                 6.24    <.001 
        Method II and Arithmetic projection (6.48 vs 12.08)        7.02    <.001 
                 and Apportionment (6.48 vs 12.17)                 7.16    <.001 
        Arithmetic projection and Apportionment (12.08 vs 12.17)   0.34      .7 


Thus the superiority of either of the school-data methods over either 
arithmetic projection or apportionment is almost certainly not due to 
chance. The greater accuracy of Method II than of Method I may be 
deemed of border-line statistical significance as far as this test is 
concerned. 
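
The paired comparison described above can be sketched in Python as follows. This is a minimal illustration of a paired t statistic on absolute percentage deviations, assuming the usual form (mean difference divided by its standard error, with n - 1 degrees of freedom); the function name and the sample deviations are invented, and the article's own worksheets are not reproduced here.

```python
import math


def paired_t(devs_a, devs_b):
    """t statistic for the mean paired difference |a| - |b| against zero."""
    # Pair the deviations area by area, disregarding sign.
    diffs = [abs(a) - abs(b) for a, b in zip(devs_a, devs_b)]
    n = len(diffs)
    mean = sum(diffs) / n
    # Sample variance of the differences (n - 1 in the denominator).
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / math.sqrt(var / n)


# Invented percentage deviations for five areas under two methods.
t = paired_t([5, -8, 12, -3, 9], [2, 4, -6, 1, -5])
print(round(t, 2))
```

Working with the paired differences is what allows for the positive correlation between methods: an area that is hard to estimate inflates both deviations, and much of that common difficulty cancels in the difference.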

There may be some doubt about the propriety of combining all three 
types of areas in the above examination. Table 1 shows, however, that 
the populations of States, cities, and counties are alike in being more 
accurately estimated by a school-data method than by a mathematical 
method. For States, cities, and counties considered separately, the 
probability that these differences arose by chance alone is always less 
than .01 and usually less than .001. Method II is significantly more 
accurate on the average than Method I only for States. Its lower average 
“error” for cities could have occurred by chance about 1 in 10 
times, and for counties the average accuracy is about the same. 

Two measures presented in Table 1 stress the larger percentage 
deviations from the population standard, whether positive or negative. 
The square root of the mean-squared percentage deviation when used 
as a criterion does not change our evaluation of the relative accuracy 
of the four methods very much. Percentage deviations of 10 per cent 
or more are next shown, and once again the school-data methods with 
their separate handling of migration and natural increase appear more 
accurate than the mathematical methods. By neither criterion, how¬ 
ever, is Method II significantly more accurate than Method I. 
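
For readers who wish to compute the four summary measures of Table 1 for their own estimates, a short Python sketch follows. The function names and the sample deviations are hypothetical; the percentage deviation is taken as the estimate minus the standard, divided by the standard, which is an assumption consistent with the table headings.

```python
import math


def pct_deviation(estimate, standard):
    """Percentage deviation of an estimate from its population standard."""
    return 100.0 * (estimate - standard) / standard


def summarize(pct_devs, threshold=10.0):
    """Summary measures of the kind shown in Table 1 for a list of deviations."""
    n = len(pct_devs)
    return {
        # Average percentage deviation, disregarding sign.
        "mean_abs": sum(abs(d) for d in pct_devs) / n,
        # Root-mean-square deviation, which stresses the larger deviations.
        "rms": math.sqrt(sum(d * d for d in pct_devs) / n),
        # Number of deviations of 10 per cent or above, in either direction.
        "n_large": sum(1 for d in pct_devs if abs(d) >= threshold),
        # Number of positive deviations (estimate above the standard).
        "n_positive": sum(1 for d in pct_devs if d > 0),
    }


# Invented deviations for five areas.
summary = summarize([3.0, -4.0, 12.0, -10.0, 1.0])
print(summary)
```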

The tests employed above, of course, depend for their validity on 
the assumption that we are dealing with functions of normally distri¬ 
buted variables. The failure of these tests to differentiate conclusively 
between the merits of Methods I and II might, therefore, be due to a 
failure of the normality assumption. To check on this point a test must 
be used in which the level of significance does not depend on the distri¬ 
bution law of the variables concerned. Recent developments in the 
theory of order statistics have provided such tests.⁵ The application of 
such a test indicates that estimates prepared by Method II have an 
average absolute deviation that is definitely smaller than that for 
estimates prepared by Method I. When the estimates for States, cities, 
and counties combined are considered, the difference proved significant 
at the .02 level. For States alone the difference proved significant at the 
.005 level, and for cities alone at the .05 level. For counties alone the 
difference was not significant. Accordingly, it appears that the addi¬ 
tional work required in using Method II instead of Method I will often 
be justified. 
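
The order-statistic tests cited here are not reproduced in this article. As a simpler stand-in that shares the property of not depending on normality, the ordinary sign test on the paired differences can be sketched as below. This is a substitute technique, not the test the authors applied, and the function name and sample deviations are invented.

```python
from math import comb


def sign_test_p(devs_a, devs_b):
    """One-sided exact sign test that |a| tends to exceed |b| (ties dropped)."""
    diffs = [abs(a) - abs(b) for a, b in zip(devs_a, devs_b)]
    diffs = [d for d in diffs if d != 0]
    n = len(diffs)
    wins = sum(1 for d in diffs if d > 0)
    # Probability of at least `wins` positive signs among n fair coin tosses.
    return sum(comb(n, k) for k in range(wins, n + 1)) / 2 ** n


# Invented deviations for ten areas: the first method has the larger
# absolute deviation in nine of the ten pairs.
p = sign_test_p([5, -8, 12, -3, 9, 7, -6, 4, 10, 2],
                [2, 4, -6, 1, -5, 3, 2, -1, 4, -3])
print(p)
```

The sign test uses only the direction of each paired difference, so it is generally less sensitive than a test that also uses their ordering or magnitude.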

For the cities and counties, the deviations of the estimates from the 
school-data methods are more evenly divided between positive and 
negative values than are those from the other methods. This situation 
is not true for the States. The higher rate of growth in the ’forties than 
in the ’thirties is particularly apparent in the nearly universally nega¬ 
tive deviations for cities shown by the mathematical methods. 

⁵ John E. Walsh, “Some significance tests for the median which are valid under very general conditions.” 
Unpublished doctoral dissertation, Princeton University Library, Princeton, N. J. 

In the estimates made by Method II, the school datum was translated 
into an estimate of the actual population of elementary school 
age by applying the 1940 ratio of the two statistics. Since this ratio 
had been variable prior to 1940, it was supposed that the accuracy of 
the final population estimates would be improved if the value of this 
ratio were extrapolated in some other way to the given postcensal date. 
We did this in two ways: (1) linear extrapolation of the trend in the 
ratio between 1930 and 1940; (2) logarithmic extrapolation. The latter 
assumes that the amount of change in the ratio has been progressively 
smaller. These two methods of extrapolation have been named Method 
IIa and Method IIb. 

In Table 2, the final results (in terms of percentage deviation of the 
estimate from the population standard) from Methods IIa and IIb 
are compared with those from the original Method II, in which the 
ratio under discussion was held constant. A consistent series of school 
data was not available from 1930 to the estimate date for all areas, 
but the methods could be compared for a total of 132 States, cities, 
and counties. 


TABLE 2 

SUMMARY OF PERCENTAGE DEVIATIONS FROM POPULATION STANDARD OF 
ESTIMATES BY THREE VARIATIONS OF THE "MIGRATION AND NATURAL 
INCREASE" METHOD FOR SELECTED AREAS AND DATES 

Item                            Number of   Method II   Method IIa   Method IIb 
                                  Areas 

Average percentage deviation 
(disregarding sign) 
  All areas                        132         6.20        7.13         7.68 
  States                            32         3.38        4.67         4.72 
  Cities                            19         3.98        4.59         4.53 
  Counties                          81         7.83        8.70         9.58 


It may be readily observed that these refinements of Method II do 
not improve on the original. It would appear that there was either very 
little change in the ratio of elementary school enrollment to population 
of elementary school age after 1940 or else the change was of a quite 
different nature from that between 1930 and 1940. Until at least 1950, 
we may safely spare ourselves the extra work involved in these par¬ 
ticular refinements of Method II. On the other hand, the extra work of 
Method II as compared with Method I seems to offer promise of more 
accurate population estimates. 
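
As a quick arithmetic check on Table 2, the "All areas" row agrees, to within rounding, with an average of the State, city and county rows weighted by the number of areas. The sketch below assumes that this is how the row was formed; the variable names are illustrative only.

```python
# Rows of Table 2: (number of areas, Method II, Method IIa, Method IIb)
rows = {
    "States":   (32, 3.38, 4.67, 4.72),
    "Cities":   (19, 3.98, 4.59, 4.53),
    "Counties": (81, 7.83, 8.70, 9.58),
}
published_all_areas = (6.20, 7.13, 7.68)

total_areas = sum(r[0] for r in rows.values())

# Weighted average of each method's column, weights = number of areas.
weighted = [
    sum(r[0] * r[col] for r in rows.values()) / total_areas
    for col in (1, 2, 3)
]

for name, calc, pub in zip(("II", "IIa", "IIb"), weighted, published_all_areas):
    print(f"Method {name}: computed {calc:.3f}, published {pub:.2f}")
```

Small differences in the last digit are to be expected because the inputs are themselves rounded to two decimals.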

A second source of data that may be used to assess the comparative 
accuracy of Methods I and II is available in the results of sample 
surveys taken by the Bureau of the Census in October and November, 
1946, and in April, 1947. In 19 cities twice the coefficient of variation 
for the sample estimate of the civilian population of the city concerned 



















166 AMERICAN STATISTICAL ASSOCIATION JOURNAL, JUNE 1949 

was about 6 per cent. Forty-two other surveys were based on smaller 
samples and are subject to still larger sampling errors. Moreover, the 
sample estimate for each of the 61 surveys was subjected to an upward 
adjustment of about 4 per cent on the assumption that this adjustment 
would make the results more comparable with those that would be 
obtained by a complete census. Evidence from national sample surveys 
indicates that an upward adjustment of about 4 per cent is needed to 
make the national Current Population Survey estimates comparable 
with those which would be obtained from a complete census. The case 
for applying the same adjustment to the results of local surveys, al¬ 
though convincing, was not altogether conclusive; and, in any event, 
the adjustment may well have been too high or too low for particular 
cities.⁶ Consequently these sample estimates cannot be regarded as 
having the same status as the population standards referred to pre¬ 
viously. We can merely examine the general consistency of the sample- 
based estimates with the result for each of the two synthetic methods. 
Such comparisons are presented in Table 3. 

TABLE 3 

SUMMARY OF PERCENTAGE DEVIATIONS FROM SAMPLE CENSUS FIGURES 
OF ESTIMATES BY METHODS I AND II: 61 SELECTED CITIES, 1946 AND 1947 

Item                                                  Method I   Method II 

Average percentage deviation (disregarding sign)        7.6         4.9 
Root-mean-square percentage deviation                   9.3         6.3 
Deviations of 10 per cent or above                       15           6 
Positive deviations                                      19          32 
Negative deviations                                      42          29 



Measured by both the average deviation and the root-mean-square 
percentage deviation, the more elaborate Method II gave results more 
consistent with the sample estimates than did Method I. The respective 
average deviations were 4.9 and 7.6 per cent; the respective root-mean- 
square percentage deviations were 6.3 and 9.3 per cent. These compari¬ 
sons are based on 61 cases. When we pair the deviations (disregarding 
sign) from the sample estimate for Method I and Method II, compute 
the mean difference of the paired deviations, and examine by the t-test 
whether this mean difference (2.7 per cent) is significantly different 
from zero, we find that the probability that the deviations from the 
two methods could have come from the same universe is less than .001. 


⁶ If this adjustment had been as little as 1 per cent, the conclusions set forth below would not have 
been changed materially. 













A very low probability was also obtained by the previously cited Walsh 
test. 
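
The paired comparison described above can be sketched as follows. The city-level deviations are not reproduced in the article, so the eight pairs below are purely hypothetical; only the procedure (difference of paired absolute deviations, then a one-sample t statistic on the differences) follows the text.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(dev_a, dev_b):
    """Return (mean difference, t statistic, degrees of freedom)."""
    diffs = [a - b for a, b in zip(dev_a, dev_b)]
    n = len(diffs)
    d_bar = mean(diffs)
    std_err = stdev(diffs) / sqrt(n)   # standard error of the mean difference
    return d_bar, d_bar / std_err, n - 1

# Hypothetical absolute percentage deviations for eight cities.
method_i  = [7.0, 9.5, 3.2, 12.1, 6.4, 8.8, 5.0, 10.3]
method_ii = [4.1, 6.0, 3.5,  8.2, 4.9, 5.1, 4.4,  6.7]

d_bar, t_stat, df = paired_t(method_i, method_ii)
print(f"mean difference = {d_bar:.3f}, t = {t_stat:.2f}, df = {df}")
```

The t statistic would then be referred to a table of Student's distribution with the stated degrees of freedom.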

Our preferred school-data estimates also come much closer than 
the simpler school-data estimates to an equal distribution of positive 
and negative deviations from the sample estimates. Method II has 
32 positive and 29 negative deviations, whereas Method I has 19 and 
42. 
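
One way to gauge how far each split of signs departs from an even division is an exact two-sided sign test. This is not a test the authors report; it is offered only as an illustrative sketch, and the function name is hypothetical.

```python
from math import comb

def sign_test_two_sided(positives, negatives):
    """Exact two-sided binomial probability for a split of signs under p = 1/2."""
    n = positives + negatives
    k = min(positives, negatives)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

p_method_i = sign_test_two_sided(19, 42)    # counts from Table 3
p_method_ii = sign_test_two_sided(32, 29)
print(p_method_i, p_method_ii)
```

A small probability for Method I and a large one for Method II would be in line with the statement that Method II comes much closer to an equal distribution of signs.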

Thus the evidence seems to confirm our choice of Method II. Of 
course, some third method might be even better. Where the school-data 
estimate and the sample estimate for a city agree, we have some feeling 
of confidence in both since they are from entirely independent meth¬ 
ods. Where the estimates disagree, either or both may be in error. 

EVALUATION OF SCHOOL DATA 

Inasmuch as both Methods I and II depend largely on school 
statistics for the measurement of net migration, serious errors of esti¬ 
mation can creep in at the very beginning of the estimating process 
unless the school data used are assembled with the main purpose al¬ 
ways in mind. In either method school statistics are needed to ap¬ 
proximate as closely as possible the number of children in a fixed age 
group. Some types of school data do so better than others. For example, 
statistics relating only to children of compulsory school age are more 
useful than those relating to all school children regardless of age since 
many of the latter go to school only by choice. Thus, data on enroll¬ 
ment by age are to be preferred over data on enrollment by grade for 
the purpose of making population estimates. If the former are not 
available, as is usually the case, grade enrollment figures are next to be 
preferred over other types of pupil statistics. But when such figures 
are used they should be restricted to the elementary grades, which 
comprise pupils most of whom are within the ages of compulsory 
school enrollment. 

Two other types of school statistics are usually available. The first, 
average daily attendance, should be avoided on principle. It is obvious 
that an increase in average daily attendance may reflect no increase 
in enrollment or population but merely improved school administration 
or better weather during the school year. Over the long run, improved 
economic conditions or changes in parental attitudes conducive to more 
regular attendance at school may also be important factors. As to the 
second of these types of data, the school census, experience has taught 
us to proceed cautiously before using it to estimate population. It is no 
simple problem to enumerate the population of an area with reasonable 




completeness even when trained enumerators do the job under well 
organized administrative controls. Yet, in most instances, school 
censuses are taken by school teachers, policemen, or unpaid volunteers, 
often without the geographic and other administrative controls needed 
to assure reasonably complete enumeration. As a result, school census 
data frequently vary in accuracy from year to year, making it difficult 
to estimate the population in an age group by reference to the relation¬ 
ship between the school census and the United States census. Children 
outside the compulsory enrollment ages are especially likely to be 
underenumerated relative to a federal census. Frequently school 
census data are not tabulated by age so that the totals shown com¬ 
prehend several age groups among which the completeness of enumera¬ 
tion varies greatly. Instances have even come to our attention of re¬ 
ports involving the publication of mere estimates improperly labeled as 
school census returns. In view of these difficulties, the school census 
has only rarely been used by the Census Bureau, and then only after 
careful investigation of the data. 

A series that closely reflects the size of a fixed age group may never¬ 
theless contain a defect from the standpoint of measuring net migra¬ 
tion for a given area. One such situation occurs when the area for which 
the estimate is being prepared is not the same as the area for which the 
school statistics are compiled. School districts, it is true, are legally 
constituted areas, but they are not always the political areas for which 
population estimates are desired. Thus school statistics for the San 
Antonio Independent School District include some pupils who do not 
live in the city of San Antonio and exclude some who do live there. To 
use the local figures, a school-by-school tabulation was required to 
eliminate those not in the city. Another instance in which the area of 
jurisdiction of the school system is a complicating factor is found when 
a city has annexed one or more suburban areas since the last decennial 
census. The legal status of the annexed area changes at an instant in 
time, but the necessary administrative arrangements to consolidate 
the annexed school into the city school system may take some time to 
complete. It may be a year or more before the pupils in the annexed 
area are reflected in the statistics for the city. 

Another stumbling block in the use of school statistics is a concealed 
inconsistency. Frequently a series of school data appears reasonable 
and consistent, but subsequent investigation reveals an administrative 
act which renders the figures not only useless but misleading. Thus, in 
one instance, a sudden increase in elementary enrollment proved to 
represent nothing more than a change from a seven- to an eight-grade 





elementary school system. There are several different ways in which 
such a change-over might be made; each will leave its impression on 
the statistics. In the instance mentioned, the impact of the change-over 
was felt all at once, creating a suspicion on our part that could be 
checked. If the impact is spread over many years, it may not be noticed 
and the data may appear to indicate a gradual population increase 
when all that is involved is an administrative change in the organization 
of the school system. Similarly the opening of kindergartens or the 
unannounced introduction of kindergarten pupils into the enrollment 
statistics will produce a misleading increase in a school data series. (In 
general, kindergarten pupils should be excluded from the school 
statistics because attendance for them is not compulsory. For similar 
reasons, children less than 7 years old, or at least those less than 6, 
should be omitted when the data are classified by age.) Still another 
illustration of essentially the same situation is the omission from the 
school data series of pupils in special or ungraded classes. The omission 
might become a matter of consequence if the school system is in the 
process of expanding such programs and is steadily reclassifying its 
pupils and assigning them into these classes from the regular classes. 

One factor about which statisticians have been apprehensive has 
usually proved to be unimportant. This is the presence of nonresidents 
in the local schools. Nonresidents, however, are seldom found in ele¬ 
mentary classes because of the travel hazards for young children. 
Children from out of town are more frequently found in high schools; 
this fact is another reason for not using high school enrollment data. 

The extent to which such circumstances as those mentioned can 
affect the population estimates may sometimes destroy whatever value 
inheres in the method of estimation. One of the concealed incon¬ 
sistencies cited above resulted in an estimate of population 14 per cent 
greater than that obtained when the inconsistency was eliminated. 
Anyone using school statistics to estimate population must not only 
be aware of the requirements for data imposed by the estimating 
technique but must also be a statistical detective, going behind the data 
to the operating system from which they were derived, and understand¬ 
ing the changes taking place in educational administration as they 
affect school statistics. 

In many areas, a substantial proportion of the population of ele¬ 
mentary school age does not attend public school but receives its 
education at parochial or other private schools. 

The importance of using enrollment figures that include private as 
well as public schools may be demonstrated by comparing estimates 





based on total enrollment figures with those based on public enrollment. 
From the 61 cities of Table 3, 47 in which private enrollment constituted 
at least 10 per cent of total enrollment were selected. Two sets of 
estimates were then made by Method II. The results were compared 
with the population returned by the sample censuses already discussed. 

TABLE 4 

SUMMARY OF PERCENTAGE DEVIATIONS FROM SAMPLE CENSUS FIGURE OF 
ESTIMATES BY METHOD II, USING TOTAL ENROLLMENT AND PUBLIC 
ENROLLMENT: 47 SELECTED CITIES, 1946 AND 1947 

Item                                                Using Total   Using Public 
                                                    Enrollment    Enrollment 

Average percentage deviation (disregarding sign)        4.7           8.9 
Root-mean-square percentage deviation                   5.8          10.1 
Deviations 10 per cent or above                           3            19 
Positive deviations                                      23             8 
Negative deviations                                      24            39 



When parochial, and sometimes other private, enrollment is taken 
into account, the population estimates are significantly more consistent 
with estimates obtained by sample censuses. The average absolute 
percentage deviation for the estimates based on both public and private 
enrollment (4.7 per cent) was significantly smaller (P<.001) than the 
average based on public enrollment alone (8.9 per cent) by either the 
t-test or the test devised by Walsh. The improvement is so pronounced 
that a strong case is made for collecting school data covering parochial 
and other private schools. The fact that Method II using public school 
data tends to give low estimates reflects the shift in recent years away 
from public schools and to parochial schools. 

OTHER SOURCES OF ERROR IN METHOD II 

Method II almost always requires the use of census statistics on 
children under 6 years old. In the census of 1940, as in most other 
censuses here and abroad, it is known that many children of this age 
were not enumerated. The census figures therefore have to be corrected 
for underenumeration of young children before the number of expected 
survivors is computed. Failure to do so would result in understatement 
of the expected population of school age and a commensurate decrease 
in the estimated net migration. 
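
The correction step can be illustrated with a minimal sketch. It assumes that the enumerated count is divided by an estimated completeness of enumeration and then multiplied by a survival rate; the function name and all numbers are hypothetical and serve only to show the direction of the effect on expected survivors.

```python
def expected_survivors(enumerated, completeness, survival_rate):
    """Expected survivors of a census cohort, assuming no migration.

    enumerated    -- children counted in the census
    completeness  -- estimated fraction of children actually enumerated
    survival_rate -- life-table proportion surviving to the estimate date
    """
    corrected = enumerated / completeness   # restore the children missed
    return corrected * survival_rate

# Hypothetical area: 10,000 young children enumerated, 95 per cent
# completeness, 98 per cent survival to school age.
with_correction = expected_survivors(10_000, 0.95, 0.98)
without_correction = expected_survivors(10_000, 1.00, 0.98)
print(round(with_correction), round(without_correction))
```

Omitting the correction gives the smaller figure, which is the understatement of the expected school-age population referred to above.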

There are published measures of underenumeration for States, urban 
and rural; but for other areas one must select the measure that seems 
most appropriate from a priori considerations. There are, of course, 
intercounty variations in underenumeration within a State just as 


















there are in the completeness of birth registration. The variation is 
probably greatest in the States where underenumeration is greatest. 
For 1940 the estimated completeness of enumeration ranged from 
79.9 per cent for South Carolina nonwhites to 98.0 per cent for Nebraska 
and South Dakota whites.⁷ 

If one is dealing with areas smaller than States, it usually will be 
necessary to distribute some age groups by single years of age since the 
single-year data were not published. This step can be performed fairly 
well on the basis of expected survivors of resident births during the ap¬ 
propriate years preceding the census. Even in areas with heavy migra¬ 
tion, only minor errors should be introduced by this assumption. 

Another point at which relatively minor errors may occur is in 
“aging” the corrected population distributed by single years to the 
estimate date. 

A life table for the given area covering the given time usually will 
not be available. Hence the most appropriate life table must be selected. 
In some cases, it may seem desirable to adapt an existing life table. 
Recent life tables published by the federal government cover: (a) regions, 
urban and rural, by color and sex, for 1939;⁸ (b) geographic 
divisions, by color and sex, for 1930 to 1939;⁹ and (c) the white population 
of States, by sex, for 1939 to 1941.¹⁰ 

Although there is a fair amount of variation among areas in infant 
and child mortality rates, there is not much variation percentagewise 
in their complement, survival rates. For instance, according to 1929- 
1931 life tables for white males, by States, the proportion of the cohort 
under 7 years old surviving to be 7 to 13 years old varies only from .976 
for Colorado to .986 for South Dakota. Nonetheless, it would be best to 
construct a life table for the area concerned, if it is large enough to have 
stable mortality rates. Actual deaths should not be subtracted from 
the population of the cohort at the base date since they include those of 
in-migrants and exclude those of out-migrants. It will be recalled that 
we are estimating the expected survivors assuming no migration. 

Still another source of error arises from the fact that the ratio be¬ 
tween the population of elementary school age and elementary school 
enrollment for any area is not constant over time. For States the 


⁷ U. S. Bureau of the Census. Population. "Differential fertility; 1940 and 1910. Standardized 
fertility and reproduction rates." p. 33. Washington, Government Printing Office, 1944. 

⁸ U. S. Bureau of the Census, "U. S. Abridged Life Tables, 1939, Urban and Rural, by Regions, 
Color, and Sex." June 23, 1943. 

⁹ U. S. Bureau of the Census, "U. S. Abridged Life Tables, 1930-39 (Preliminary), by Geographic 
Divisions, Color, and Sex." April 30, 1943. 

¹⁰ National Office of Vital Statistics. "State and Regional Life Tables, 1939-41." Washington, 
Government Printing Office. 1948. 






average change in the ratio of the population 7 to 13 years old to public 
elementary school enrollment was +0.038 between 1930 and 1940. 
Disregarding sign, the average change was 0.047. The range was from 
—0.052 to +0.141. It has been shown in Table 2, however, that during 
the 'forties more accurate population estimates are obtained on the 
average by assuming that the 1940 ratio did not change than by ex¬ 
trapolating the 1930-1940 trend in this ratio. 

We come now to what is probably the major assumption in our 
method. This is that the net rate of change through migration for 
elementary school-age children is equal to the net migration rate for 
the population of all ages. Migration is quite selective of certain age 
groups, of course; but there is some evidence that children of elementary 
school age, who move with their parents, are of intermediate mobility 
between young adults and the aged. Insofar as child migration is near 
the average rate, it may be used to represent the rate for the whole 
population. 
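
How this assumption enters the estimate can be shown in a rough sketch. The exact base and timing used in Method II are defined earlier in the article and are not reproduced here; the version below simply assumes that the school-age net migration rate is applied to the base population of all ages and that natural increase is added separately. All names and figures are hypothetical.

```python
def method_ii_sketch(base_population, natural_increase,
                     expected_school_age, actual_school_age):
    """Illustrative population estimate under the equal-migration-rate assumption."""
    # Net migration rate of school-age children: departure of the actual
    # school-age population from the survivors expected with no migration.
    migration_rate = (actual_school_age - expected_school_age) / expected_school_age
    # Assumption under discussion: the same rate holds for all ages.
    net_migration = migration_rate * base_population
    return base_population + natural_increase + net_migration

estimate = method_ii_sketch(
    base_population=100_000,      # last census count (hypothetical)
    natural_increase=4_000,       # births minus deaths since the census
    expected_school_age=10_000,   # expected survivors, no migration
    actual_school_age=10_500,     # derived from school data
)
print(round(estimate))
```

If the true all-ages rate were, say, half or twice the child rate, the migration component would change in proportion, which is why the assumption matters so much.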

Past experience in the pattern of net migration rates by age may be 
studied. From 1935 to 1940, the net interstate migration of children 5 
to 13 years old was in the same direction as that of the total population 
in all except four States and the District of Columbia. Among the 44 
States where net migration was in the same direction, the rate for chil¬ 
dren 5 to 13 years old averaged 91 per cent of that for all ages. The 
range was from 40 per cent to 167 per cent, with a standard deviation 
of 35 percentage points. 

Among the 90 cities of 100,000 or more in both 1940 and 1930, the 
net migration of children 5 to 13 years of age was in the same direction 
as that of the total population in all except 5 cases. In these 85, the rate 
for the children averaged 145 per cent of that for all ages. The range 
was from 3 per cent to 400 per cent, the standard deviation being 69 
percentage points. 

From these facts, it can be appreciated that there is a great deal of 
variation in the relationship between the child-migration rate and the 
rate for the population of all ages. It is thus easy to make a large error 
in estimating the total net migration from the migration at the age of 
school enrollment. We can, of course, use the ratio observed in the 
particular State or city with which we are dealing. We do not know, 
however, how representative the ratio for 1935 to 1940 is for subsequent 
periods. There are still other disturbing factors. One is that in 
current postcensal estimates we have to deal with periods ranging from 
one to nine years instead of just five years. Another is that the age 
group chosen to represent school enrollment may not be 5 to 13; for 





instance it may be 7 to 14. Late in the decade, the school-age cohort 
includes some children born since the last census. This part of the cohort 
has had a shorter period of exposure to "risk" of migration than 
have persons of older ages. Fortunately, there is never a very large 
difference in the average length of exposure between the whole school- 
age cohort and the population of all ages. 

A great deal of investigation remains to be done on the age selectivity 
of net migration for different areas over various periods of time. So far 
we have relatively few statistics on net migration by age. If we have 
taken care of the other sources of error in the method so that the 
migration assumption is the only important item of guesswork remain¬ 
ing, we can do some experimentation by trial and error. We can see 
what migration assumptions produce the most accurate population 
estimates, using a complete census or other reliable figure as the 
standard. 

The foregoing discussion of sources of error was concerned with the 
estimation of net migration. The estimation of the actual natural in¬ 
crease is beset with relatively minor pitfalls. The available figures on 
under-registration of births are subject to some error. It is usually 
assumed that deaths are almost completely registered. There may also 
be some inaccuracy in the allocation of births and deaths to the correct 
area of usual residence. 

The Census Bureau's State estimates, which are prepared essentially 
by Method II, are adjusted to total to a national estimate, which we 
consider to be quite comparable to a decennial enumeration. Similarly, 
county estimates made locally may be scaled up or down to a State 
total. Such controls reduce the chance of very large errors. Whenever 
censuses or better estimates are available for particular local areas, 
they should, of course, be substituted. For example, the method we 
have described proved inappropriate for Washington, D. C., because 
of the unusual character of migration to this city. As an experiment, 
this Bureau used a wide variety of statistics in preparing the estimates 
recently published.¹¹ We feel that although our school-data method is 
promising enough to warrant extensive experimentation and polishing, 
concurrent exploration of other sources and methods should be con¬ 
ducted. 
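
The adjustment of local estimates to a control total mentioned above can be sketched as a simple proportional scaling. The text says only that estimates "may be scaled up or down"; proportional scaling is assumed here, and the county names and figures are hypothetical.

```python
def scale_to_control(estimates, control_total):
    """Scale area estimates proportionally so that they sum to a control total."""
    factor = control_total / sum(estimates.values())
    return {area: value * factor for area, value in estimates.items()}

# Hypothetical county estimates and an independent State estimate.
counties = {"County A": 52_000, "County B": 31_000, "County C": 18_500}
state_total = 100_000

adjusted = scale_to_control(counties, state_total)
for area, value in adjusted.items():
    print(area, round(value))
```

Every area is moved by the same percentage, so a very large error in one county cannot push the sum away from the State figure.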

¹¹ U. S. Bureau of the Census. Population - Special Reports. "Estimated population of the 
Washington, D. C., Metropolitan Counties: 1940 to 1946." Series P-47, No. 5. May 14, 1947. 




THE USES AND USEFULNESS OF BINOMIAL 
PROBABILITY PAPER* 

Frederick Mosteller 

AND 

John W. Tukey 

Harvard University and Princeton University 

This article describes certain uses of Binomial Probability 
Paper. This graph paper was designed to facilitate the employ¬ 
ment of R. A. Fisher's inverse sine transformation for propor¬ 
tions. The transformation itself is designed to adjust bino- 
mially distributed data so that the variance will not depend on 
the true value of the proportion p, but only on the sample 
size n. In addition, binomial data so transformed more closely 
approximate normality than the raw data. 

The usefulness of plotting binomial data in rectangular co¬ 
ordinates, using a square-root scale for the number observed 
in each category, was first pointed out by Fisher and Mather 
[10]. The graph paper under discussion¹ is specially ruled to 
make this mode of plotting both simple and rapid. A graduated 
quadrant makes the angular transformation ($p = \cos^2 \phi$ 
or $p = \sin^2 \phi$) easily available at the same time. Most tests of 
counted data can be made quickly, easily and with what is 
usually adequate accuracy with this paper. Some 22 examples 
are given. 

PART I—GENERAL 

INTRODUCTION 

IN ANALYZING data which has been counted rather than measured the 
biologist, opinion pollster, market analyst, geologist, physicist, or 
statistician frequently uses as a model the binomial distribution, its 
limiting case the Poisson distribution, or some of their generalizations. 
For many purposes graphical accuracy is sufficient. The speed of graph¬ 
ical processes, and more especially the advantages of visual presenta¬ 
tion in pointing out facts or clues which might otherwise be overlooked, 
make graphical analysis very valuable. A special type of graph paper 
has been designed and is now available for such analysis. 

The examples below show how such paper can be used in various 
computations—clearly there are many parallel cases in other fields of 
analysis. 


* Prepared in connection with research sponsored by the Office of Naval Research. 

¹ Available from the Codex Book Company, 74 Broadway, Norwood, Mass. No. 31,298 on thin 
paper. No. 32,298 on thick paper. 



The basic idea involved is due to R. A. Fisher, who, many years ago, 
observed that the transformation 

$$\cos^2 \phi_i = \frac{n_i}{n}$$ 

transformed the multinomial distribution with observed numbers 
$n_1, n_2, \ldots, n_k$ into direction angles $\phi_1, \phi_2, \ldots, \phi_k$ which were nearly 
normally distributed with individual variances nearly $1/4n$ (when the 
angles are measured in radians). Thus the point at a distance $\sqrt{n}$ from 
the origin and in the direction given by $\phi_1, \phi_2, \ldots, \phi_k$ is distributed on 
the $(k-1)$ dimensional sphere nearly normally, and with variance 
nearly $\tfrac{1}{4}$ independent of $n$ and the true fractions $p_1, p_2, \ldots, p_k$ of the 
different classes in the population. The rectangular coordinates of this 
point are $\sqrt{n_1}, \sqrt{n_2}, \ldots, \sqrt{n_k}$. 
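
The variance claim can be checked numerically for the two-category case. The sketch below computes the exact variance of the angle $\phi = \arccos\sqrt{X/n}$ for a binomial count $X$ and multiplies it by $n$; the result should be near 1/4 whatever the true proportion. The function name and the chosen values of n and p are illustrative assumptions.

```python
from math import acos, comb, sqrt

def scaled_angle_variance(n, p):
    """Return n * Var(phi), where phi = arccos(sqrt(X/n)) and X ~ Binomial(n, p)."""
    probs = [comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)]
    angles = [acos(sqrt(x / n)) for x in range(n + 1)]
    mean = sum(w * a for w, a in zip(probs, angles))
    variance = sum(w * (a - mean) ** 2 for w, a in zip(probs, angles))
    return n * variance

for p in (0.2, 0.5, 0.8):
    print(p, round(scaled_angle_variance(100, p), 4))
```

Since the plotted point lies at distance $\sqrt{n}$ from the origin, an angular variance near $1/4n$ corresponds to a variance near 1/4 along the circle, as stated above.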

In 1943 Fisher and Mather [10] applied this principle to the graphical 
plotting of the square roots of the observed numbers of Lythrum salicaria 
styles of various types on ordinary graph paper. After reading 
this article, the present authors were convinced that special graph 
paper, designed to facilitate this process, would be worthwhile and 
designed the paper [12] mentioned above, whose use forms the subject 
of this article. 

Binomial Probability Paper is graduated with a square-root scale on 
both axes 1(1)20(2)100(5)400(10)600 and 1(1)20(2)100(5)300. (The 
notation A(a)B(b)C means: "From A to B by intervals of size a and 
thence from B to C by intervals of size b".) It serves to treat classification 
into two categories directly. Cases with three categories would 
require three-dimensional graphing for direct treatment, but a device 
not explained in this paper enables one to treat any number of cate¬ 
gories on this paper. 

PLOTTING 

The binomial distribution 

The classical example of the binomial distribution is the distribution 
of the number of heads in n independent throws of a coin, true (50-50), 
or biased (for example 40-60 or 57-43). Two assumptions require 
emphasis here: (i) the throws are independent and do not influence one 
another, (ii) the probability of a head on each throw is the same. Re¬ 
place “heads” and “tails” by “successes” and “failures,” or by “regu¬ 
lar females” and “exceptional females,” or by “alcoholic” and “non¬ 
alcoholic,” or by any classification into two categories and these two 




requirements, suitably modified, are the requisite conditions that the 
observed numbers in the categories will be distributed binomially. 

We must distinguish (i) the true relative probability of two cate¬ 
gories, which we will often refer to as the split and (ii) the observed 
numbers in the two categories, which we will often refer to as the 
paired count. In describing a split we do not bother with normalization: 
the split of a "true" coin is 50-50 or 1-1 or 127-127. In describing a 
paired count we always give the actual numbers counted. 

Binomial plotting 

A split is plotted as a straight line through the origin; it passes 
through all the points whose coordinates represent it. Thus in Figure 1 
the lowest line corresponds to the 80-20 split. In addition to going 
through (80, 20) this line goes through the points (8, 2), (16, 4), (20, 5), 
(120, 30) and so on; the corresponding splits are all the same. The 77-23 
split is also called the 23% line, and the (100-X)-X split is also called 
the X% line. 

A paired count is plotted as a triangle whose sides extend for one scale 
unit in the positive directions of the axes from its right angle, which has 
the paired count for its coordinates. Thus (1, 3) is plotted as the triangle 
with vertices (1, 3), (2, 3) and (1, 4). The paired counts (1, 3), 
(14, 1), (155, 4), (1, 110), (125, 150) are plotted in Figure 1. (The 
modification of the classical angular transformation thus induced is 
discussed in Part V (page 206).) 

If any coordinate exceeds 100, it is usually satisfactory to plot a line 
segment (vertical in the case of (155, 4) and horizontal in the case of 
(1, 110)). If both coordinates exceed 100, it is usually satisfactory to 
plot a point. 

The middle of a paired count is the center of the hypotenuse, which 
degenerates into the center of the line segment. 

Percentages 

1. To convert a split into a theoretical percentage read the coordinates of 
its intersection with the quarter circle (paying no attention to the angular 
scale). 

2. To convert a paired count into an observed percentage, draw the split 
through its lower left corner and read the coordinates where the split crosses 
the quarter circle. (It is convenient to consider this an auxiliary line and 
draw it dashed.) Thus (2, 47) corresponds to percentages of 4.1 and 
95.9 (Figure 1). 
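
The graphical reading in rule 2 can be mimicked arithmetically. The sketch assumes that the quarter circle passes through 100 on each square-root axis, so that the two coordinates read at any point on it sum to 100; the function name is hypothetical.

```python
from math import sqrt

RADIUS = sqrt(100)  # assumed: the quarter circle meets each axis at 100

def observed_percentages(count):
    """Percentages read where the split through a paired count meets the quarter circle."""
    x, y = count
    norm = sqrt(x + y)
    # Plotting position of the crossing point, in square-root units.
    u = RADIUS * sqrt(x) / norm
    v = RADIUS * sqrt(y) / norm
    # The square-root scale labels a position u with the value u**2.
    return u ** 2, v ** 2

low, high = observed_percentages((2, 47))
print(round(low, 1), round(high, 1))
```

For the paired count (2, 47) this reproduces the 4.1 and 95.9 per cent quoted above.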




Measurements of deviations 

A right triangle has three corners, but we shall only be concerned 
here with the two acute-angled ones. There are two distances from a 
line to these corners of a triangle. We call these the short distance, and 
the long distance. Thus, in Figure 1, the distances from the 80-20 split 
to the triangle representing (14,1) are 4.2 mm. and 8.6 mm. The dis¬ 
tance to the middle of the hypotenuse, which is the average of the other 
two, is the middle distance, here 6.4 mm. 

In some cases there are definite reasons, theoretical or empirical, to 
prefer the short distance, the long distance or a combination of the two. 
In the remaining cases it seems plausible to use the middle distance, 
though investigation may prove otherwise. In the examples below we 
have followed this plausible procedure. 

Such distances are usually to be interpreted as nearly normal devi¬ 
ates. The scale in the upper left marked “Full scale—Individual Stand¬ 
ard Errors” allows these distances to be read as multiples of the stand¬ 
ard deviation. 

For most purposes, however, a millimeter scale is more convenient. 
On the paper one standard deviation is almost exactly B mm. (if the paper 
has not shrunk or stretched, it is 6.080 mm.). Thus the short, middle 
and long distances of 4.2, 6.4 and 8.6 mm., from the example above, are 
0.82, 1.24 and 1.67 standard deviations. A short table of the normal 
distribution is given in Part VI, both in millimeters and in standard 
deviations. For many purposes it is only necessary to remember that 
one deviation in three should be outside —6 mm. to 6 mm. and one in 
twenty outside —10 mm. to 10 mm. purely due to sampling fluctua¬ 
tions. 
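
The two rules of thumb can be checked numerically. The following is a minimal sketch, assuming one standard deviation is 5.080 mm and that distances behave as exactly normal deviates; the function names are illustrative only.

```python
import math

SIGMA_MM = 5.080  # assumed: one standard deviation on unshrunk paper, in mm


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def mm_to_sd(distance_mm):
    """Convert a distance on the paper to multiples of the standard deviation."""
    return distance_mm / SIGMA_MM


def two_sided_level(distance_mm):
    """Chance that a normal deviate falls outside -distance .. +distance."""
    return 2.0 * (1.0 - normal_cdf(mm_to_sd(distance_mm)))


for d in (5.0, 10.0):
    print(f"{d:4.1f} mm = {mm_to_sd(d):.2f} sd, two-sided level {two_sided_level(d):.3f}")
```

With these assumptions the level for 5 mm comes out near one in three and the level for 10 mm near one in twenty.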

The reader who is interested in the simpler applications should now 
go to Part II (page 182). The reader who is interested in a specific ap¬ 
plication is referred to Part VI (page 209) which provides an index 
and outline. 

REFINEMENTS IN PLOTTING 
Plotting larger counts 

Paired counts up to (600, 300) can be plotted directly. If there are
more observations, but fewer than (6,000, 3,000), the paired count may
be plotted at 1/10 scale; thus (1,216, 621) would be plotted as (121.6,
62.1), as in Figure 1. In this case

(1) the left-hand upper scale should be replaced by the right-hand 
upper scale in converting distances into multiples of the standard 
deviation; 




(2) distances in millimeters should be multiplied by 3.16 before using 
Tables 1 and 2; 

(3) the triangle would extend from (121.5, 62.1) to (121.6, 62.1) and
(121.5, 62.2) and would be completely indistinguishable from a
point.
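
The factor 3.16 in (2) is the square root of 10. A brief sketch of the reason, assuming (as is characteristic of this paper) that a count $x$ is plotted at a length proportional to $\sqrt{x}$, so that dividing every count by 10 shrinks every plotted length by $\sqrt{10}$:

```latex
\[
\sqrt{\frac{x}{10}} = \frac{\sqrt{x}}{\sqrt{10}}
\quad\Longrightarrow\quad
d_{\text{full scale}} = \sqrt{10}\; d_{1/10\ \text{scale}}
\approx 3.16\; d_{1/10\ \text{scale}} .
\]
```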

Very large counts 

If significance tests are required for still larger samples, graphical
accuracy is insufficient, and arithmetical methods are advised. A word to
the wise is in order here, however. Almost never does it make sense to
use exact binomial significance tests on such data, for the inevitable
small deviations from the mathematical model of independence and
constant split have piled up to such an extent that the binomial
variability is deeply buried and unnoticeable. Graphical treatment of such
large samples may still be worthwhile because it brings the results more
vividly to the eye.

Plotting unsymmetrical counts 

Occasionally the user may have data involving large counts of one
kind and small counts of the other, where the total count may not be
fixed. Such cases may be treated by choosing a suitable divisor, and
using the divisor only for the category with the large counts. If this is
done, the distance from paired count to split should be measured at
right angles to the contracted axis and not at right angles to the split.
Each triangle reduces to a segment extending one scale unit away from
the divided axis. For good accuracy, it is necessary that each small
count be less than 5% of the corresponding total count.

Poisson subcase 

The Poisson distribution yields the subcase of this case where the 
total number is arbitrarily large and fixed. Binomial probability paper 
may still be used without difficulty, plotting the various observations 
on any chosen, fixed vertical line, thus using an unknown horizontal 
divisor. 

Crab addition

In order to sum the squares of deviations in a chi-square, we may 
proceed either by measuring, squaring and adding, or by crab addition, 
using the theorem of Pythagoras. Crab addition is easily done by using 
a piece of tracing paper as follows: 

i) Mark a small circle as the origin (0) on the tracing paper and set
it on the theoretical line at the point of intersection with the
perpendicular dropped from the first sample point;

ii) Mark a point (1) on the tracing paper at the sample point. The
distance between the tracing paper origin and the first sample point is
the square root of the contribution of the first sample point to $\chi^2$;
call it $\chi_1$.

iii) Line up the origin (0) and the point (1) on the tracing paper on
the theoretical line with the point (1) at the base of the perpendicular
from the second sample point to the theoretical line, and plot a point
(2) on the tracing paper immediately over the second sample point. The
distance from point (1) to point (2) is $\chi_2$, the square root of the
contribution of the second sample point to $\chi^2$, and the distance from the
origin (0) to point (2) is $\sqrt{\chi_1^2 + \chi_2^2}$. We continue in this manner,
producing a picture like that shown in the lower right-hand corner of Figure 1.
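
Crab addition is repeated use of the theorem of Pythagoras, so its numerical counterpart is a running root-sum-of-squares. The sketch below is illustrative only; `crab_sum` and `chi_square_from_mm` are hypothetical names, and the conversion to chi-square assumes each distance is a normal deviate with 5.080 mm to the standard deviation.

```python
import math


def crab_sum(distances):
    """Running Pythagorean sum: each new distance is laid off at right
    angles to the total so far, as on the tracing paper."""
    total = 0.0
    for d in distances:
        total = math.hypot(total, d)
    return total


def chi_square_from_mm(distances_mm, sigma_mm=5.080):
    """Assumed conversion: distances are normal deviates with sigma_mm
    millimetres per standard deviation, so chi-square is the squared
    crab sum expressed in standard deviations."""
    return (crab_sum(distances_mm) / sigma_mm) ** 2


# Two and three perpendicular legs: 3-4-5 and then 5-12-13.
print(crab_sum([3.0, 4.0]))
print(crab_sum([3.0, 4.0, 12.0]))
```

The order of the segments does not affect the final crab sum, which is why the points may be taken in any convenient sequence.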

Use for chi-square 

If the individual segments submitted to crab addition were middle
distances, then the final crab sum may be (geometrically) doubled in
length and read on the marginal scale to obtain a $\chi^2$ value. (Multiply
this answer by another factor of 10 if the points were plotted to 1/10
scale.)

If the individual segments submitted to crab addition were ranges,
then the final crab sum may be read on the marginal scale and then
(numerically) doubled in value to obtain a $\chi^2$ value (see Ex. 24).
(Multiply by another factor of 10 if the points were plotted to 1/10 scale.)

SIGNIFICANCE LEVELS AND SIGNIFICANCE ZONES 

The continuous case 

When a value of a continuously distributed statistic, such as Student's
“t,” is computed from a sample, there is no difficulty (provided
the distribution is well tabled) in finding and stating its level of
significance. Thus a t of −0.98 on 3 degrees of freedom lies at (i) the lower
20% point, (ii) the upper 80% point, (iii) a (balanced) two-sided 40%
point. These statements mean that, if the simple situation against
which we are assessing the evidence really holds, and similar experiments
are repeated, (i) an algebraically smaller value (that is, < −0.98)
of t will occur in 20% of the cases, (ii) an algebraically larger value
(that is, > −0.98) will occur in 80% of the cases, (iii) a value further
from the center (that is, $|t| > 0.98$) will occur in 40% of the cases. The
two-sided statement is apparently simple to make in the case of t, since
the distribution concerned is symmetrical and continuous. In asymmetrical
cases, we will follow the path of least resistance and calculate
the two-sided significance level as twice the (smaller) one-sided
significance level, just as 40% = 2(20%). (Some statisticians might disagree.)
The choice of one among the three significance levels in a practical
situation will depend on the alternatives considered for the situation
against which the evidence is being assessed.
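
The three levels quoted for t on 3 degrees of freedom can be reproduced without tables. This is a minimal sketch using the closed-form distribution function of Student's t that holds for 3 degrees of freedom only; the function name `t3_cdf` is illustrative.

```python
import math


def t3_cdf(t):
    """Distribution function of Student's t on 3 degrees of freedom
    (closed form; not valid for other degrees of freedom)."""
    x = t / math.sqrt(3.0)
    return 0.5 + (x / (1.0 + x * x) + math.atan(x)) / math.pi


t = -0.98
lower = t3_cdf(t)                      # chance of an algebraically smaller value
upper = 1.0 - lower                    # chance of an algebraically larger value
two_sided = 2.0 * min(lower, upper)    # doubling convention used in the text
print(f"lower {lower:.3f}, upper {upper:.3f}, two-sided {two_sided:.3f}")
```

The results should lie close to the 20%, 80% and 40% points stated above.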

The discrete case 

In the binomial situation, and in the many other cases where the 
result obtained proceeds in definite steps, the situation is a little more 
complex. Consider the case of a sample of 4 from a 2-1 split. The 
probabilities of the various possible outcomes are given below to 3 deci¬ 
mal places. 


Outcome:                          (0,4)   (1,3)   (2,2)   (3,1)   (4,0)
Frequency:                         .012    .099    .296    .395    .198
Cumulated from left to right:      .012    .111    .407    .802   1.000
Cumulated from right to left:     1.000    .988    .889    .593    .198


What significance level shall we assign to (1, 3) in this case? The
conventional answers are that it lies at: (i) the lower 11.1% point, (ii) the
upper 98.8% point, (iii) the two-sided 22.2% point. These statements
mean that, if the situation were binomial with a 2-1 split, (i) one of the
outcomes (0, 4) or (1, 3) occurs in 11.1% of all cases, (ii) one of the
outcomes (1, 3), (2, 2), (3, 1), (4, 0) occurs in 98.8% of all cases, (iii)
it is reasonable to act as though an outcome deviating from expectation
as much or more than (1, 3) in one direction or the other occurs in 22.2%
of all cases.

An alternative approach, and one which supplies more information,
is to attach to such a result as (1, 3) not a single significance level, but a
significance zone. Thus the lower significance zone for (1, 3) is “11.1%
to 1.2%” and the two-sided significance zone is “22.2% to 2.4%” (where
we have adopted the convention of doubling in passing from one-sided
to two-sided significance zones).

The statement “(1, 3) is at the lower (11.1%, 1.2%) zone” means
that, if the simple binomial situation with split 2-1 holds, then (0, 4),
which is the only outcome further from expectation than (1, 3) in the
same direction, will occur in 1.2% of all cases, while (0, 4) and (1, 3)
together will occur in 11.1% of all cases.
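
The lower zone for (1, 3) can be reproduced from exact binomial terms. The sketch below is illustrative; `lower_zone` is a hypothetical helper, and the 2-1 split is taken to mean a chance of 2/3 for the first category.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Chance of exactly k in the first category out of n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def lower_zone(first_count, n, p_first):
    """Lower significance zone for the paired count (first_count, n - first_count).

    Returns (outer, inner): the chance of a first count this small or
    smaller, and the chance of a strictly smaller one.
    """
    outer = sum(binomial_pmf(k, n, p_first) for k in range(first_count + 1))
    inner = outer - binomial_pmf(first_count, n, p_first)
    return outer, inner


outer, inner = lower_zone(1, 4, 2 / 3)
print(f"lower zone for (1, 3): ({outer:.1%}, {inner:.1%})")
```

The conventional significance level is the outer end of the zone; the inner end is what the zone adds.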

Interpretation 

The working statistician, we believe, would almost always react
differently to the statement “the outcome is at the lower (12%, 0.5%)
zone” than to the statement “the outcome is at the lower (12%,
11.5%) zone.” The (12%, 0.5%) zone indicates that we are not sure that
the strength of the evidence has reached the conventional 5% point, but it
is possible that this is the case. The (12%, 11.5%) zone indicates that
the strength of the evidence has certainly not reached the conventional
5% point. This distinction is absent from the customary procedure of
assessing just a significance level.

In some cases, auxiliary experimental data could be used to interpo¬ 
late between the ends of the significance zone. 

As we shall see, binomial probability paper can be used to obtain 
both the customary significance level and the significance zone. If we 
are to determine the two percentages needed to fix a significance zone, 
we shall need to make two measurements on the paper, so it is not 
surprising that we will want to plot more than a single point to repre¬ 
sent a paired count. 

PART II—PLOTTING ONE OBSERVED QUANTITY 

A SINGLE BINOMIAL SAMPLE 

Given a sample, sorted into two categories, the numbers in the categories
form a paired count, which is plotted as a triangle. If a probability
or a population proportion, p, is assigned for the second category,
then this is plotted as the q-p split, where $q = 1 - p$. The approximate
significance level corresponds to the short distance from the split to the
triangle, and may be obtained from Table 2 (page 211). If it is desired
to use the significance zone, then both short and long distances should
be measured and the probabilities obtained from Table 2. These general
principles will suffice to attack the problems posed by the first four
examples.

Comparing an observed proportion with a theoretical proportion 

Example 1 (See Figure 2). Fisher and Mather [11] have described
a genetical experiment in which the individuals in 32 litters of 8
mice were observed for the characteristic of straight versus wavy
hair. Under the conditions of the experiment, the Mendelian theory
predicted that half the mice would have straight hair. It was observed
that $n_1 = 139$ had straight hair, $n - n_1 = 117$ had wavy hair. Could such
a discrepancy from the 128-128 split have arisen by chance?

[FIGURE 2. Binomial probability paper with the constructions for
Examples 1, 3, 6, 8, 9 and 12.]

The observed paired count is plotted as the triangle (139, 117), (139,
118), (140, 117) and is just distinguishable from a point. The theoretical
proportion is plotted as the 128-128 split (the 50% line). The short
distance is about 6.7 mm. = $1.3\sigma$. Since deviations from simple Mendelian
genetics might reasonably occur in either direction, the two-sided
significance level is appropriate, and from Table 2 this is found to
be about 20%. Thus this test would not lead us to reject simple Mendelian
genetics.

Since the two-sided 5% distance is 10 millimeters, which is worth re¬ 
membering, Table 2 would not have been needed in routine testing, 
since the result “not significant at the 5% point” would have been 
enough. 

We may observe the experimental percentage of straight-haired mice 
by drawing a horizontal line to the vertical axis from the intersection 
of the degrees quarter-circle with the line from the observed point to 
the origin. This gives about 54.5 per cent as compared with an actual 
value of 54.3 per cent. 

If the significance zone were desired, then both the short distance of
6.7 mm. and the long distance of 7.5 mm. would have been measured.
The corresponding two-sided significance interval is (20%, 14%).

Critique of Example 1. The experimental conditions seem to have 
been exactly suited for a binomial test. The accuracy of the graphical 
method is clearly adequate. 

The sign test

The classical case of the sign test is the comparison of two materials 
or treatments, in pairs, where the observations in each pair are com¬ 
parable except for the materials or treatments being tested. The sign 
test is a special case of the comparison of theoretical and observed pro¬ 
portions, where the theoretical proportion is always 50% and has been 
thoroughly discussed in this Journal by Dixon and Mood [4,1946]. 

Example 2 (See Figure 3). Dixon and Mood cite the yields of two
lines of hybrid corn where 6, 8, 2, 4, 3, 3, and 2 pairs of plots were
available from 7 experiments. In 7 out of the 28 pairs line A yielded
higher.

If a significance zone is wanted, then both the short distance of 12.9 
mm. and the long distance of 14.7 mm. are measured. The resulting 
two-sided significance zone is (1.2%, 0.3%). 





Comment on Example 2. The tables of Dixon and Mood correctly
state that (7, 21) is not at the 1% level. Some statisticians, however, in
part because of the general and imprecise considerations which lead to
the choice of 1%, may use the extra information in the significance
zone and decide that they would rather work at the 1.3% level than
the 0.4% level (the precise values are 1.254% and 0.372%). They
would then treat (7, 21) as significant at a level of approximately 1%.

Extension of the sign test 

As presented above, the sign test applies to the hypothesis of equality
in paired experiments. It can be easily extended to cover (i) the
hypothesis of a constant additive difference, (ii) the hypothesis of a
constant percentage difference, or (iii) the hypothesis of a certain
population median by constructing dummy observations. Suppose that five
experiments have produced the following numbers:


Set             1    2    3    4    5
Condition A    57   53   49   56   51
Condition B    26   31   24   28   31


There is clearly no need to test for equality. A test of the hypothesis
that condition A runs 30 units above condition B may be made by
adding 30 to each observation under condition B, which yields 2 positive
and 3 negative differences. A test of the hypothesis that condition
A gives numbers double those of condition B may be made by doubling
the condition B numbers, which yields 2 positive, 1 zero, and 2 negative
differences (the corresponding paired count is (2, 2)!). A test of the
hypothesis that the median number in condition A is 57 may be made by
replacing the results in condition B by an imaginary experiment always
giving 57, which yields 1 positive, 1 zero, and 3 negative differences
(the corresponding paired count is (1, 3)!).

Confidence limits for population proportions

If we have a sample divided into two categories, we may wish to set
confidence limits on the percentage of the population in each category.
This is accomplished by plotting the paired count which was observed
and then constructing two splits whose short distances from this triangle
correspond to the two-sided level (of confidence) required. Thus
if 95% confidence is required, short distances of 10 mm. will be used.
The coordinates of the intersections of the two splits with the
quarter-circle give the confidence limits for percentages.

Example 3 (See Figure 2). A random sample of 500 farms from a certain
region yields the information that 143 of the farmers own outright
the farms they work. Set 95 per cent confidence limits on the per cent
of farms in this region wholly owned by the farmers.

We plot the triangle (357, 143), (358, 143), (357, 144), which can, in
this case, be approximated by a point. We draw arcs of circles of radius
10 mm. about its extreme vertices, and draw the tangent splits. We get
as our estimate of the percentage of farmer whole-owners 28.5 (compared
to 28.6 computed) and as our 95% confidence limits, 25 per cent
and 33 per cent.
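
The tangent-split construction can be imitated with a little trigonometry. The sketch below is an illustration under assumptions, not the paper's own procedure: counts are taken to be plotted at their square roots with 10.16 mm per unit, and a split tangent to a circle about a vertex is found by turning the line through that vertex by the angle whose sine is the radius divided by the vertex's distance from the origin. The name `tangent_split_percent` is hypothetical.

```python
import math

MM_PER_UNIT = 10.16  # assumed plotting scale: mm per unit of sqrt(count)


def tangent_split_percent(vertex, radius_mm, side):
    """Percentage in the second category for the split through the origin
    that is tangent to a circle of radius_mm about the plotted vertex.

    side = -1 gives the split below the vertex, +1 the split above it.
    """
    x, y = vertex
    r = math.hypot(math.sqrt(x), math.sqrt(y)) * MM_PER_UNIT  # mm from origin
    angle = math.atan2(math.sqrt(y), math.sqrt(x))
    split_angle = angle + side * math.asin(radius_mm / r)
    return 100.0 * math.sin(split_angle) ** 2


# Extreme vertices of the triangle for 357 non-owners and 143 owners.
lower = tangent_split_percent((358, 143), 10.0, -1)
upper = tangent_split_percent((357, 144), 10.0, +1)
print(f"limits: {lower:.1f}% to {upper:.1f}%")
```

Under these assumptions the limits round to the 25 and 33 per cent read from the paper.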

Critique of Example 3. If the sample is truly random the method is
sound. If stratified sampling is correctly done, these confidence limits
will be unnecessarily large by an amount which depends on the
effectiveness of the strata chosen, for the particular question at hand. The
increase in efficiency by stratification is often so small that these limits
are a wise choice.

Confidence limits for the population median 

By combining the ideas of the sign test and the last example, we may
obtain confidence limits for the population median. If $x_0$ is the population
median, and is used to divide samples into paired counts, then certain
extreme paired counts will occur with probability at most $1 - \alpha$,
where $\alpha$ is the desired confidence. By determining these unlikely splits,
and referring to the observations, it is possible to set confidence limits
for the population median. While strict confidence requires taking
blocks of paired counts with at most a certain probability, interpolation
can usually be carried out with safety.

Example 4 (See Figure 3). The differences between reaction-times of
an individual and a control group were 6, 6, 3, 2, 1, −1, −2, −3, −6,
−12, −13, −13, −16, −28, when expressed on a logarithmic scale
(actually in terms of 100 times the log to the base 10). Within what
limits do we have 90% confidence that the median of the population
lies?

Drawing the 50-50 split, and parallel lines at ±8.4 mm., we find that
the triangles for (10, 4) and (4, 10), where 4+10 = 10+4 = 14 (the number
of cases), are cut by these parallel lines about ¾ of the way from
their vertices nearest the 50-50 split. We interpolate ¾ of the way from
1 to 2 (these are the fourth and fifth values from the top) and ¾ of the
way from −12 to −13 (these are the fourth and fifth values from the
bottom) to obtain approximate 90% confidence limits of −12¾ and 1¾
for the median of the population of differences from which the 14
observed differences were a random sample.

Critique of Example 4. The interpolation is based on the fact that the
use of 1.001 (say) for a cutting point would give the paired count (4,
10), as would any cut up to 1.999, while 2.001 would give (3, 11). If we
take account of the grouping and rounding process we should widen
these interpolated values by ½ the grouping interval. Thus if the
differences were rounded from more decimals to the nearest tenth, and were
actually 6.0, 6.0, 3.0, ..., −28.0, then we should use −12¾ − 1/20
= −12.8 and 1¾ + 1/20 = 1.8. If, as might more reasonably have been the
case, they had been rounded to integers, we should use −12¾ − ½
= −13.25 and 1¾ + ½ = 2.25.

COMPARISON AND ANALYSIS OF VARIANCE

Comparison 

The comparison of two variances calculated for samples from normal
populations leads alternatively to Fisher's z or Snedecor's F. As
explained in Section 14, the distribution of these quantities is
mathematically related to the binomial distribution. Thus we may use
binomial probability paper to make a significance test of the equality
of the variances of two normal populations, the only approximation
being the approximation of binomial probability paper to the binomial
distribution.

The test is made by drawing the line or split through the point whose
coordinates are the observed sums of squares of deviations from means,
and plotting the point whose coordinates are half the numbers of
degrees of freedom.

Example 5 (See Figure 3). In volume 1 of Biometrika, Fawcett
[8, 1902, p. 442] gives the sample variances of the lengths of 141
male Egyptian skulls as 34.2740 and the variance of the lengths of 187
female skulls as 30.5756. Are these significantly different?

The sums of squares of deviations are 4832.6 and 5717.6. The degrees
of freedom are 140 and 186. Drawing the split through (48.3, 57.2)
and plotting the point (70 = 140/2, 93 = 186/2), we find a distance of
3.7 mm. which is near a (two-sided) significance level of 45%. Thus
there is no evidence of a difference in variance.
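
The distance in this example can be approximated numerically. The sketch below rests on assumptions not stated in the example itself: square-root plotting at 10.16 mm per unit, 5.080 mm to the standard deviation, and a normal deviate for the distance. The helper names are hypothetical, and the sums of squares are formed as sample size times variance, which reproduces the figures quoted above.

```python
import math

MM_PER_UNIT = 10.16  # assumed plotting scale: mm per unit of sqrt(count)
SIGMA_MM = 5.080     # assumed: one standard deviation on the paper, in mm


def distance_mm(point, split):
    """Perpendicular distance (mm) from a plotted point to the split
    through the origin and the plotted position of `split`."""
    x, y = point
    a, b = split
    norm = math.sqrt(a + b)
    cross = math.sqrt(x) * math.sqrt(b) - math.sqrt(y) * math.sqrt(a)
    return abs(cross) / norm * MM_PER_UNIT


def two_sided_level(d_mm):
    """Two-sided normal tail area for a distance in mm: 2 * (1 - Phi(z))."""
    z = d_mm / SIGMA_MM
    return 1.0 - math.erf(z / math.sqrt(2.0))


sums_of_squares = (141 * 34.2740, 187 * 30.5756)  # about 4832.6 and 5717.6
half_df = (140 / 2, 186 / 2)                      # the plotted point (70, 93)
d = distance_mm(half_df, sums_of_squares)
print(f"distance {d:.1f} mm, two-sided level roughly {two_sided_level(d):.2f}")
```

Only the ratio of the two sums of squares fixes the split, so plotting it through (48.3, 57.2) or through the unscaled sums gives the same line.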

Critique of Example 5. The dangers in such a test lie far more in the
possible lack of normality of the populations of practical experience
than in the approximations of binomial probability paper. Such
comparisons of variances from independent samples based on normality
should almost always be taken with a grain of salt, particularly when
so many degrees of freedom are involved.

FIGURE 3
SHOWING THE SIGN TEST (EX. 2), CONFIDENCE LIMITS FOR MEDIAN (EX. 4), THE
F TEST (EX. 5), OPERATING CHARACTERISTIC OF SIGN TEST (EX. 10), SAMPLE SIZE FOR
POPULATION TOLERANCE INTERVALS (EX. 11)
[Axis label: NUMBER—OF PAIRS WITH LINE A HIGHER (EX. 2, 10), OF CASES BELOW
POPULATION MEDIAN (EX. 4), OF DUMMY FAILURES (EX. 11)]

This procedure is quite convenient when both degrees of freedom
are greater than 24 (this is outside the range of short tables of F or z)
and will be quite accurate if both degrees of freedom are at least 2.
When one of the degrees of freedom is 1, accuracy will be improved by
plotting ½ seven-tenths of the way from zero to one, because

The use of binomial probability paper has the feature, interesting to
some, of producing estimates of intermediate levels of significance.
If it is desired to check the tabled values of levels of F against the
results given by binomial probability paper, one plots the split $(n_1 F, n_2)$
and measures the distance to the point $(n_1/2, n_2/2)$, remembering that
the tabled values provide a one-sided test.

Analysis 

The same process can, of course, be applied to the analysis of variance,
in principle a special case of the comparison of variances, but in
practice a whole realm of its own. The sum of squares column determines
the line and the df column the point.

Example 6 (See Figure 2). How can the analysis of variance in 
Example 7 below be tested graphically? 

Draw the line through (257.15, 73.16) and plot the point (1, 5). 
The perpendicular distance of 15 mm is very highly significant (be¬ 
yond the 0.1 per cent point). 

Critique of Example 6. The difficulties with normality, mentioned in
the last critique, are usually reduced enough to be neglected in an
analysis of variance situation. Only in case this graphical method yields
borderline significance is accuracy an excuse for using an F or z table,
though convenience may often be a reason.

The angular transformation 

The analysis of variance of counted data is frequently facilitated
[3, Bartlett 1947 and references cited therein] by making the angular
transformation. If the data involve small numbers, the accuracy of
graphical transformation will suffice. The observation is plotted, a line
through the origin produced to the quarter-circle, and the corresponding
angle read off.

Example 7. W. E. Kappauf and W. N. Smith (personal communication)
tested the performance of six observers in reading three types of
dials. Sixty readings were made by each observer on each size. The
errors, angles, and analysis of variance are shown below. (The reader
will find it easy to check on the computation of the angles on his own
piece of graph paper.)



Source         df    Sum of squares    Mean square
Observers       5        991.64        198
Sizes           2        257.15        129
Interaction    10         73.16          7.3
Binomial        -          -           821/60 = 13.7*

* Variance of an angle obtained from binomially distributed data
  $= \frac{1}{4n}\ (\text{radians})^2 = \frac{821}{n}\ (\text{degrees})^2$.

The usual test for significance of the effect of size would be F = 129/7.3
= 17.8 on 2 and 10 degrees of freedom, which is highly significant.

Critique of Example 7. A possibly more conservative test would be
F = 129/13.7 = 9.4 on 2 and $\infty$ degrees of freedom, which is still very
highly significant. Although 7.3 on 10 df is not significantly less than
13.7, there is reason to believe in this case that the error mean square
in a large-scale repetition of this experiment might well be less than
13.7. For the analysis above assumes the errors distributed binomially,
and the probable differences in difficulty of the various dials attempted
might reduce this variance.
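
The binomial error variance used in this critique follows from the angular transformation. A minimal sketch, assuming the angle read from the quarter-circle is the arcsine of the square root of the observed proportion, expressed in degrees; the function names are illustrative.

```python
import math


def angle_degrees(count, n):
    """Angular transformation of a count out of n: arcsin(sqrt(count / n))."""
    return math.degrees(math.asin(math.sqrt(count / n)))


def binomial_angle_variance(n):
    """Variance, in square degrees, of the angle from a binomial sample of n:
    1/(4n) square radians converted to square degrees."""
    return math.degrees(1.0) ** 2 / (4.0 * n)


variance_60 = binomial_angle_variance(60)
print(round(variance_60, 1))        # 13.7
print(round(129 / variance_60, 1))  # 9.4
```

The constant 821 in the footnote is (180/π)²/4 rounded to the nearest integer.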




















PART III. APPLICATIONS TO DESIGN 

BINOMIAL DESIGN 

Since binomial probability paper allows us to approximately judge the
significance of a paired count, it must also let us plan binomial
experiments to have desired properties.

Sample size necessary to resolve two given percentages 

When designing a test to discriminate between two theoretical 
percentages, the experimenter often wishes to know that any result 
will give significant evidence against at least one of the two theories. 
The procedure is best described by an example. 

Example 8 (See Figure 2). A geneticist wishes to test whether a
certain character appears in one-half or one-quarter of the progeny of
a certain mating. He requires significance at the (two-sided) 1 per cent
level against at least one of these hypotheses and wishes to know the
smallest sample size which will guarantee this. He draws the 50-50
and 25-75 splits and then parallel lines at a distance of 13.1 mm.
($2.58\sigma$) which corresponds to the two-sided 1% level. These parallel
lines intersect at (37, 63). This point separates the triangle (36, 63),
(37, 63), (36, 64) from the triangle (37, 62), (38, 62), (37, 63). Thus the
paired count (36, 63) is beyond the 1% level from the 50-50 split,
while (37, 62) is beyond the 1% level from 25-75. Thus a sample size
of 36+63 = 37+62 = 99 (= 37+63−1) will be enough.

Critique of Example 8. The design of this experiment is very good as
far as sample size and significance levels are concerned. Since such
genetical ratios are usually well-behaved, the experimenter who uses
a sample of at least 99 and protects the progeny against causes of
differential mortality should obtain very good results.

Our criticism should be directed against his less careful competitor
who says that he will use the two-sided 1% level also, but will be
satisfied with a sample size for which an observed 75-25 proportion
will be at this level. Since the parallel line for 50-50 cuts the 25-75
split at (6, 18), he will use a sample size of only 5+18 = 6+17 = 23.
This design probably does not meet his needs, for if he uses it over and
over he will be well protected from falsely stating the ratio is not 50-50
(since he is using a two-sided 1% level) but he will miss one-half of the
cases where the ratio is 25-75. Such designs emphasizing one risk are
usually ill-chosen, and if such a choice is compelled by limited
experimental resources the choice of significance level should be
re-examined, looking at both types of risk.




Designing single sampling plans 

An essentially equivalent problem arising in industrial work as
well as in certain kinds of experimental work is to design a sampling
inspection plan which will distinguish between two kinds of quality,
say product which is $100p_1$ per cent defective and that which is $100p_2$
per cent defective ($p_1 < p_2$). The plan desired is often described in
terms of the operating characteristic curve, namely that large lots
having $100p_1$ per cent defectives should be accepted $100(1-\alpha)$ per cent
of the time, while lots having $100p_2$ per cent defectives should only be
accepted $100\beta$ per cent of the time ($1-\alpha > \beta$). The process of building
such a plan is described by the following:

Example 9 (See Figure 2). It is desired to construct a plan which
will accept product with 3 per cent defectives 95 per cent of the time,
while product 12 per cent defective is to be accepted only 10 per cent
of the time. What sample size should be used, and how many defectives
can be tolerated before the lot is rejected?

We construct the 3% line (the 97-3 split) and the 12% line (the
88-12 split). If we wish to accept lots which are 3 per cent defective
95 per cent of the time, we must accept lots whose samples, as plotted
on the paper, go as high as $1.64\sigma$ above the 3 per cent line; consequently
we draw a parallel line $1.64\sigma$ = 8.4 mm. above the 3% line. Similarly,
in order to reject material which is 12 per cent defective 90 per cent of
the time, we must reject lots whose samples come within $1.28\sigma$ of the
12% line, so we draw a parallel line $1.28\sigma$ = 6.5 mm. below the 12%
line. The intersection of the two construction lines is (61, 5) and we
find that the sample size is 61+5−1 = 65, the acceptance number is 4,
the rejection number is 5.

Critique of Example 9. It is assumed here that the lot is much larger 
than the sample, say ten times as large. Notice that these significance 
levels (10% and 95%) were one-sided. Consultation of tables shows 
that this plan will accept 3%-defective lots 96.77% of the time and 
will accept 12%-defective lots 9.69% of the time, which is approxi¬ 
mately the result requested. 
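
The intersection of the two construction lines in Example 9 can be located by solving two linear equations in the plotted (square-root) coordinates. This is a rough sketch under the assumption that the paper's scale is 10.16 mm per unit of the square root of a count; `intersect_offsets` is a hypothetical helper, and the result is only expected to agree with the graphical reading to within plotting accuracy.

```python
import math

MM_PER_UNIT = 10.16  # assumed plotting scale: mm per unit of sqrt(count)


def intersect_offsets(p_low, above_mm, p_high, below_mm):
    """Intersection of a line `above_mm` above the p_low split with a line
    `below_mm` below the p_high split.

    Returns (count in first category, count in second category), where the
    second category is the one with proportions p_low and p_high.
    """
    s1, c1 = math.sqrt(p_low), math.sqrt(1 - p_low)
    s2, c2 = math.sqrt(p_high), math.sqrt(1 - p_high)
    d1, d2 = above_mm / MM_PER_UNIT, below_mm / MM_PER_UNIT
    # Solve  -s1*u + c1*v =  d1   (above the low split)
    #        -s2*u + c2*v = -d2   (below the high split)
    det = s2 * c1 - s1 * c2
    u = (d1 * c2 + d2 * c1) / det
    v = (s1 * d2 + s2 * d1) / det
    return u * u, v * v


good, defective = intersect_offsets(0.03, 8.4, 0.12, 6.5)
print(f"intersection near ({good:.1f}, {defective:.1f})")
```

Under these assumptions the intersection lands close to the (61, 5) read from the paper, with the second coordinate just under 5.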

The operating characteristic of the sign test 

We often want to know how well the sign test will discriminate. 

Example 10 (See Figure 3). In Example 2 we considered 28 pairs of
observations and decided to treat (7, 21) as significant. It is natural to
inquire what population percentage of favorable pairs is needed to
insure significance at this level 95% of the time. We must then find a
split so that the triangle representing (7, 21) is 8.4 mm. = $1.65\sigma$ away.
This leads to the 12-88 split, and so the sign test with 28 pairs
discriminates very well between 50% and 12% (or 88%).

As in Examples 8 and 9, we can determine a sample size so that the
sign test will have given discriminating power.

TOLERANCE LIMIT DESIGN 

Sample sizes for populatiori tolerance limits 

In industrial work it may be desirable to take the least sample from
an unknown population, such that the range from the smallest value
in the sample to the largest value in the sample will cover a given
fraction $\alpha$ of the population with given confidence $\beta$. This may be
shown to be equivalent to a binomial problem, namely: Find the least
sample size from a q-p split ($p = 1 - q$) such that the second count will
be at least 2 with confidence $\beta$. (This is, of course, a special case of
Example 8.)

If it is desired to use the rth from the bottom and the mth from 
the top to establish tolerance limits, replace 2 by r+m. 

Example 11 (See Figure 3). A manufacturer of ball bearings wishes
to have 99.5% confidence that 90% of his ball bearings lie between the
limits set by the largest and smallest of a sample of a chosen size. He
draws the 10% line (10% = 100% − 90%!), a parallel line $2.58\sigma$ = 13.2
mm. lower (for 99.5% confidence) and the horizontal line through 2.
The intersection of the last two lines gives (71, 2) and the desired
sample size is 74 = 72+2.

Critique of Example 11. This example assumes that the successive
ball bearings produced behave like a random sample; no manufacturer
of any metal object, even ball bearings, has reached so high a
state of control. The practical interpretation of such a sample size is
that it is a lower bound.

Second sample tolerance limits 

When tolerance limits are desired for a second sample, rather than
for the population, the problem is hypergeometric, and may only be
approximated by a binomial problem. The chance that the range between
the rth from the bottom and the mth from the top from a
sample of n will omit $N_1$ or less from a second sample of N may be
shown to be the same as that a sample of $N_1 + n_1$ (where $n_1 = r + m$)
from a finite population split N to n will contain $n_1$ or more of the
second sort. If $N_1 \le N/10$, this can be approximated by the chance
of a second count of $n_1$ or more in a sample of $N_1 + n_1$ from a N-n split.

Example 12 (See Figure 2). A manufacturer of precision resistors
tests random samples of 1000 of each new type, and establishes the
second from each end as working limits. He wishes to know the
confidence with which he may expect (a) 99.5 or (b) 99 per cent of a
batch of 50,000 similar resistors to fall within these working limits.
He draws the 50,000-1,000 split and computes the chances of getting a
second count of 4 or more in a sample of (a) 250+4, (b) 500+4. These
are given by the distances from the split to (250, 5) and (500, 5),
which are 0.0 mm, corresponding to 50.0%, and 9.3 mm, corresponding to
3.4%. He has, therefore, 50.0% confidence that 99.5% will be within
working limits and 96.6% confidence that 99% will be within working
limits.

In the light of this example, the reader can construct answers to 
the other problems of tolerance limits for a second sample. 

Critique of Example 12. If the manufacturer's production line is
nearly in control when it starts to produce a new type, the authors
will be surprised. The procedure given will answer the manufacturer
who "wishes limits to the confidence, PROVIDED the process were in
perfect control from the start." Also note that m and r have to be
selected before the sample is examined, if the probabilities are to be
accurate.


ANALYSIS OF VARIANCE DESIGN

Operating characteristic of model II—anova 

An analysis of variance situation is Model II [6, Eisenhart 1947]
when the effects are drawn from a normal population with variance
$\sigma_1^2$, the errors being drawn from a normal population with
variance $\sigma^2$. If the effects are the c column effects in a
simple design with r rows and c columns, then the mean squares for
columns and for error have the expectations $\sigma^2 + r\sigma_1^2$
and $\sigma^2$, and the degrees of freedom $(c-1)$ and $(c-1)(r-1)$.
In Model II, the mean squares are still distributed like multiples of
chi-square. Thus the power of such an experiment can be easily
determined as in the following example.

Example 13 (See Figure 5). A random sample of 25 sailors are to have
their balancing ability measured quantitatively on each of 7 days. The
results will be submitted to the analysis of variance, and a 5%
significance level used. How large must $\sigma_1^2/\sigma^2$ be,
before the existence of differences between sailors will be detected
95% of the time?

Here c = 25, r = 7, and the degrees of freedom are 24 and 144, hence
we plot the point (12, 72). A one-sided 5% level corresponds to 8.4
mm, so we draw a circle around (12, 72) with this radius. The tangent
splits are 27.2-100 and 8.9-100, so that the critical variance ratio is
found from

$$\frac{\sigma^2 + 7\sigma_1^2}{\sigma^2} = \frac{27.2}{8.9}$$

to be $\sigma_1^2/\sigma^2 = 0.29$.

The basis of this construction is as follows: A ratio of sums of squares
of 27.2 to 100 is needed for the chosen significance level. A ratio as
small as $8.9(\sigma^2 + 7\sigma_1^2)$ to $100\sigma^2$ can be expected
by chance 5% of the time if the population ratio is
$\sigma^2 + 7\sigma_1^2$ to $\sigma^2$; hence to obtain 95% confidence
of finding significance at 5%, we must have

$$\frac{8.9(\sigma^2 + 7\sigma_1^2)}{100\sigma^2} \ge \frac{27.2}{100}.$$
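
The last step is plain arithmetic on the two tangent splits. As a
small sketch (the function name is hypothetical, and the split values
27.2 and 8.9 are simply those read from the paper), the critical ratio
can be recovered as follows.

```python
def critical_variance_ratio(upper_split: float, lower_split: float, r: int) -> float:
    """Solve lower_split * (1 + r * ratio) = upper_split for ratio = sigma_1^2 / sigma^2."""
    return (upper_split / lower_split - 1.0) / r


# Tangent splits 27.2-100 and 8.9-100, with r = 7 days per sailor.
ratio = critical_variance_ratio(27.2, 8.9, 7)
print(round(ratio, 2))  # 0.29
```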

PART IV—SEVERAL PAIRED COUNTS 

LEVEL AND HOMOGENEITY COMBINED

Comparing several sets of data with a theoretical proportion

A set of several paired counts may give evidence against a fixed 
theoretical proportion in two ways. If the observed proportions are 
too variable, they indicate lack of homogeneity, that the samples came 
from populations with different proportions. If the average observed 
proportion deviates too much from the theoretical, it indicates a change 
(or error) in level, that the theoretical proportion does not apply. 

It is well known that the correct way to test a homogeneous set of 
paired counts for agreement with a preassigned population proportion 
is to add them together, and then test the sum (as in Part II). Tests of 
homogeneity alone are the subject of the next section. Many delicate 
testing procedures involve first a test of homogeneity and then a test 
of level. Combined tests are mainly used to make quick and easy tests. 

Example 14 (See Figure 4). A production process has been producing
an average of 15% defective pieces over a long period. After the
introduction of a new batch of raw material, successive shifts produced
the following paired counts of nondefectives and defectives:

1. (155, 20), 2. (164, 41), 3. (106, 12), 4. (41, 10).

Is it reasonable to think that the production process is producing the
same proportion of defectives as before? We plot the points and the
15% line. Since no middle distance from the line is as great as
$2\sigma = 10$ mm, and since the points are about equally spread about
the line, there seems to be no reason to suppose the new batch of raw
material has made a change in the percentage defective.

FIGURE 4

SHOWING COMPARISON OF K PROPORTIONS WITH THEORY (EX 14, EX 15), FIRST
STEP TOWARD A STABILIZED p CHART (EX 16), HOMOGENEITY OF K PROPORTIONS
(EX 17, EX 18, EX 19)

[Figure not reproduced. Vertical axis: number defective (Ex. 14, 15
and 16), or non-alcoholics (Ex. 17), or normal (Ex. 18 and 19).]

FIGURE 5

[Figure not reproduced; its caption is not legible in this copy.]

If some points were found outside $2\sigma$ the resulting paired count
(of points inside $2\sigma$ vs. points outside $2\sigma$) would be
compared with the 19-1 split by the method of Example 2.

An alternative procedure is to combine the middle distances from 
the theoretical line by crab addition (on page 179) and consider this 
a chi-square with as many degrees of freedom as there are paired 
counts. 

Example 15 (See Figure 4). Taking the same data as in Example 14,
the crab sum, which can be obtained in less time than it takes to read
the description, is a length, which when doubled may be read on the
marginal scale as $\chi^2 = 9.0$, which is between the upper 10% and
upper 5% points for 4 degrees of freedom.
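
For comparison with the graphical value, the ordinary arithmetic
chi-square can be computed directly. This is a swapped-in check, not
the crab-addition procedure: it sums squared standardized deviations
of the defective counts from the theoretical 15%, and the function
name is hypothetical.

```python
def chi_square_against_theory(pairs, p_second):
    """Sum over paired counts of (second - n*p)^2 / (n*p*(1-p))."""
    total = 0.0
    for first, second in pairs:
        n = first + second
        expected = n * p_second
        variance = n * p_second * (1.0 - p_second)
        total += (second - expected) ** 2 / variance
    return total


# (nondefective, defective) counts for the four shifts of Example 14.
shifts = [(155, 20), (164, 41), (106, 12), (41, 10)]
chi2 = chi_square_against_theory(shifts, 0.15)
print(round(chi2, 2))  # roughly 8.8, on 4 degrees of freedom
```

The arithmetic value is close to the 9.0 read from the paper and falls
in the same interval between the upper 10% and upper 5% points.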

Critique of Examples 14 and 15. Merely examining to see whether
any points fall outside the $2\sigma$ limits does not squeeze the data
dry, and is not very precise, but is very convenient in situations
where a control chart would seem reasonable and proper. Both tests
tend to detect either a change to a new level or excess variability
due to changes in level from shift to shift. To check one of these
alone, different procedures should be used. A change to a new level
should be tested for by combining all the shifts, and thus comparing
(466, 83) with 85-15 (which is far from significance). Changes in
level from shift to shift should be tested for by the methods of the
next examples (Ex. 17-19).

The stabilized p-chart

Where the lot size is constant in 100% inspected industrial production 
or the sample size is constant when sampling inspection is used, one 
of the standard quality control procedures is the p-chart, where the 
percentage of each lot or sample found to be defective is plotted against 
lot number or time. 

When lot size is not constant, and it is not feasible to break down
the data into groups of different size, there is a need for a new
technique. The use of groups of varying size is not recommended (it
would be better to break up the lots into rational sub-lots of nearly
uniform size), but where the quality control engineer cannot arrange
for the better solution he may wish to use the following device which
was suggested to us by Acheson Duncan [5]. (The classical device is
to plot observed percentages, with broken horizontal lines for control
limits. This makes a hard-to-read, messy diagram.)

Example 16 (See Figures 4 and 6). The following data on adjustment
irregularities of electrical apparatus appear in the ASTM Manual on
Presentation of Data [1, 1940 Supplement B, p. 58]. The first number
given is the sample size, the second, the number of defectives. (We
shall not correct the sample size by subtracting the number of
defectives, because in the one case where this might make a visible
difference there are no defectives.) We divide the sample size by 10,
and plot the triangles in Figure 4 (to this scale, the triangles
reduce to vertical segments).


ADJUSTMENT IRREGULARITIES, ELECTRICAL APPARATUS

Lot   Sample Size   Defectives      Lot   Sample Size   Defectives
  1       600            2           11      1550            7
  2   [illegible]        2           12       950            2
  3   [illegible]        1           13       950            5
  4   [illegible]        1           14       950            2
  5   [illegible]        5           15        35            0
  6      2000            2           16       330            3
  7   [illegible]        0           17       200            0
  8       780            3           18       600            4
  9   [illegible]        0           19      1300            8
 10   [illegible]       15           20       780            4


From Figure 4 we measure the vertical deviations from the p line 
(which is assumed to be 0.27% based on past experience) and plot 
them on a regular control chart (Figure 6), being sure to keep the data 
in the order in which they originally appeared. (The use of tracing 
paper makes this process very easy.) In practice each new observation 
would first be plotted on binomial probability paper (perhaps at an 
enlarged scale) and then transferred. If the data are retained on the 
original probability paper, the advantage of examining the data for 
trends and runs would be lost. 

Critique of Example 16. The control chart in Figure 6 looks very
different from that usually given. In the usual chart Lot 19 is shown
as beyond the control limits on the high side, and Lots 4 and 7 are not
detected as being possibly too defective-free because we find that there
can be no lower control limit. This kind of plotting might be useful
even when the samples are of constant size.


















TESTS OF HOMOGENEITY 

The general case 

Given 5 or 50 or 500 paired counts which we wish to test for
homogeneity, to test if it is reasonable that they have arisen from
sampling a population with the same percentages in the two categories,
the problem is the same, but the practical solution is different. In
every case we plot the individual paired counts and draw the best
fitting split, either by eye or through the sum of the paired counts.

FIGURE 6

SHOWING COMPLETED, STABILIZED P-CHART AT ENLARGED SCALE
(WITH SEGMENTS MADE VERTICAL).

[Figure not reproduced. Vertical axis: deviations.]

We shall discuss three methods here, namely: 

(1) graphical chi-square, 

(2) range, 

(3) counts in $\pm 1\sigma$ and $\pm 2\sigma$ strips.

Each of these has advantages and disadvantages, and, to the best of
our knowledge, they can be compared as follows:


Method | Number of Samples, Feasible | Number of Samples, Recommended | Advantages | Disadvantages
$\chi^2$ | 2 to $\infty$ | 2 to 8 or 15 | efficiency | labor
range | 2 to ? | 2 to 20 | ease and speed | limited efficiency
counts | 15 to $\infty$ | 15 or 20 to $\infty$ | 80% efficiency; relative simplicity |


The range is only recommended for 20 paired counts or less since its
use for larger k involves the delicate details of the normal
distribution and since its efficiency is less than the counting method.

To apply the $\chi^2$ test, plot the paired counts and the split
through their sum, and combine the middle distances by crab addition as
explained at the end of Section 4.

Example 17 (See Figure 4). The following classic data by C. Goring
quoted by K. Pearson and by M. G. Kendall compare the number of
alcoholics and non-alcoholics among criminals according to crimes
committed. The first number in each pair is the number of alcoholics:

Arson       (50, 43)       Stealing    (379, 300)
Rape        (88, 62)       Coining     (18, 14)
Violence    (155, 110)     Fraud       (63, 144)

Totals      (753, 673)

The graphical display shows very clearly that (1) the observations are
discrepant, (2) the crime of fraud is the only one for which the
proportion of alcoholism is really different. Graphical chi-square
computation gives 30.2 on 5 degrees of freedom, which is highly
significant. When criminals convicted for fraud are removed from
consideration the remaining five groups are each less than one
standard deviation away from the new fitted line (690-529 split).
Stealing is slightly misplotted.

Critique of Example 17. If the definition of "alcoholic" were
sufficiently objective, and if the sample of convicted criminals
represents a random sample of criminals, then the analysis seems
sound. It cannot, of course, throw any appreciable light on the
connection between alcoholism and crime in general, bearing only on
the question "excluding fraud, do alcoholics tend to be convicted of
some types of crime and non-alcoholics of others?"

A quicker method of analysis, and one well suited for drawing lines
by eye, is to compute the range of the sample, that is the sum of the
greatest middle distances to the right and left of the line. Then the
range measured in millimeters or standard deviations can be compared
with Table 4 (at end of paper) to discover whether the samples deviate
enough among themselves to provide evidence that the observations did
not arise from random sampling from a single proportion in the
population.

Example 18 (See Figure 4). In testing the effect of the X-chromosome
inversion on secondary non-disjunction, K. W. Cooper (personal
communication) raised 21 cultures of Drosophila melanogaster, crossing
v-lD.{l)B^'^/yhirh)/Y females with wild-type Canton-S males. The
presence or absence of secondary non-disjunction can be detected in
female progeny. The 21 cultures gave the following results:

(9,135), (7,115), (11,118), (13,89), (15,148), (8,91), (6,113), 

(11,104), (9,122), (10,90), (15,155), (14,138), (5,84), (11,128), 

(2,34), (4,73), (4,107), (9,107), (10,103), (11,115), (8,104) 

where the smaller class showed secondary non-disjunction. The split
is drawn through the total of (192, 2273). The range is 16.5 mm which
is far from significant; thus this test gives no evidence of
heterogeneity.

Critique of Example 18. Clearly an amount of heterogeneity large
enough to affect the estimated standard error of the grand mean would
almost surely have been detected. The test seems adequate for its
purpose.

A more refined, but still simple test is obtained by drawing parallel
lines at ±5 mm and ±10 mm. In sets of 21 or more samples, we expect
about 5% outside the outer lines and about 33% outside the inner
lines. The weighted sum

12 (no. outside 10 mm) + 3 (no. between 5 mm and 10 mm)
- 2 (no. inside 5 mm)

is distributed with mean nearly zero and variance nearly 11.72k in
case of k homogeneous samples. Approximate significance levels are
$5.65\sqrt{k}$ and $8\sqrt{k}$ for the 5% and 1% points.
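
The weighted sum and its approximate significance levels are easy to
tabulate. The following sketch uses hypothetical function names; the
weights 12, 3, -2 and the multipliers 5.65 and 8 are those quoted in
the text, and the counts (0, 4, 17) are the ones reported for the 21
cultures in Example 19.

```python
from math import sqrt


def weighted_strip_sum(outside: int, between: int, inside: int) -> int:
    """12*(outside 10 mm) + 3*(between 5 and 10 mm) - 2*(inside 5 mm)."""
    return 12 * outside + 3 * between - 2 * inside


def approximate_levels(k: int):
    """Approximate 5% and 1% points for k homogeneous samples."""
    return 5.65 * sqrt(k), 8.0 * sqrt(k)


score = weighted_strip_sum(outside=0, between=4, inside=17)
five_pct, one_pct = approximate_levels(21)
print(score, round(five_pct, 1), round(one_pct, 1))
```

The rounded multiplier 8 gives a 1% point slightly above the 36.5
quoted in Example 19, which presumably rests on a less rounded value.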

Example 19 (See Figure 4). Returning to the data of Example 18
and drawing the 5 mm and 10 mm lines, we find (classifying borderline
cases according to the center of the hypotenuse of the triangles):

             Expected    Found
Outside          1          0
Between          6          4
Inside          14         17

The weighted sum is 12 - 34 = -22, which is far from the significance
levels of 25.9 and 36.5. This more delicate test finds no evidence of
heterogeneity in the per cent of secondary non-disjunction in Cooper's
cultures. Indeed, if anything the data are a little too homogeneous,
though not enough to notice. These methods can also be applied to the
case of unsymmetric counts, as in the following example.

Example 20 (See Figure 5). In testing the effects of X-chromosome
inversions on primary non-disjunction, K. W. Cooper (personal
communication) crossed 847 males with females of eight different
chromosomal sequences. Exceptional cases can be detected in both
males and females. The observed counts for males were (2885, 13),
(7172, 18), (4672, 13), (9162, 14), (1389, 2), (2961, 4), (2199, 2),
(1195, 1). Does the rate of primary non-disjunction seem to be
constant?

The total count is (31635, 67), and the corresponding split together
with the individual counts are plotted on Figure 5, where all horizontal
coordinates have been divided by 60. The lines parallel to the total split
are at ±5 mm and ±10 mm vertically. The range, measured vertically
(since the horizontal coordinate has been reduced), is 19.3 mm which
is not far from the 5% point of 21.8 mm.

Critique of Example 20. The method of Example 19 should not be
applied on so few points, but if it were applied the weighted sum would
be 12(1) + 3(3) - 2(4) = 13. The value of $5.65\sqrt{k}$ is 16.0, but
direct calculation of the 5% point yields 19. Thus the approximate 5%
point is not too accurate for 8 points (16.0 is about the 10% point).
Calculation of $\chi^2$ by crab addition of the vertical deviations
from the line to the total yields 13.9, which is again not quite at
the 5% level for 7 degrees of freedom.

The four-fold table

The four-fold table, where a sample is classified into two categories
in each of two ways, has received very much attention by both applied
and theoretical statisticians. Different methods of analysis have been
given, some of which assume that

(1) the sample is a representative of samples in which only the total
is fixed, or

(2) the sample is a representative of samples in which one set of
marginal totals are fixed, or

(3) the sample is representative of samples in which all marginal
totals are fixed.

Many of the "control group versus experimental group" experiments
so common in biology, medicine, psychology, and education fall under
(2), since the numbers in the control and experimental groups are
fixed. Such experiments can be approximately analyzed as a homogeneity
test as in the last section. For the case of two paired counts, the
chi-square and range methods are equivalent, and the range is simpler.

Example 21 (See Figure 5). English et al. [7, 1940] took samples of
208 smokers and 208 non-smokers and investigated the incidence of
coronary disease. They found (198, 10) and (206, 2), where coronaries
are the second category. The range is 17 mm which is significant at the
5% level.

They also took 187 cases with coronary disease, and 302 without,
and investigated the incidence of smoking. They found (149, 38) and
(187, 115), where smokers are the first category, which yields a range
of 30.1 mm (Figure 5) which is horribly significant.
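
Since the chi-square and range methods are equivalent for two paired
counts, the first comparison can be checked with the ordinary Pearson
chi-square on 1 degree of freedom. This is a swapped-in arithmetic
check, not the graphical range; the function name is hypothetical and
3.841 is the standard tabled upper 5% point.

```python
def homogeneity_chi_square(pairs):
    """Pearson chi-square of k paired counts against their pooled proportion."""
    total_first = sum(a for a, _ in pairs)
    total_second = sum(b for _, b in pairs)
    p = total_second / (total_first + total_second)   # pooled share of the second category
    chi2 = 0.0
    for a, b in pairs:
        n = a + b
        chi2 += (b - n * p) ** 2 / (n * p * (1.0 - p))
    return chi2


# Smokers (198, 10) and non-smokers (206, 2); coronaries are the second category.
chi2_first_study = homogeneity_chi_square([(198, 10), (206, 2)])
UPPER_5_PERCENT_1_DF = 3.841   # tabled chi-square value, 1 degree of freedom
print(round(chi2_first_study, 2), chi2_first_study > UPPER_5_PERCENT_1_DF)
```

The result exceeds the 5% point, in agreement with the verdict reached
from the 17 mm range.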

Critique of Example 21. These last two samples can be united into a
four-fold table, but, in view of the way in which the data were obtained,

                    Coronary disease
                    Yes       No
Smokers             149       187
Non-smokers          38       115

it would be incorrect to compare (187, 149) with (115, 38) by this
method and to assume that two binomials were being compared. However,
the range obtained in this way is 29.7 mm and it is possible that such
inverted tests on binomial probability paper give approximately
correct answers.

A less obvious example 

The ideas behind the sign test may be extended to give approximate 
tests in many situations of greater complexity. Such tests may be very 
useful, when used with the knowledge that they are quick, but often 
lack the sensitivity of more complex methods. 

Example 22 (See Figures 7 and 5). A routine bioassay had been in
use for two years using a standard curve. Occasional checks on the
standard had been made. The situation is shown in Figure 7, which
raised two questions: (i) Does the curve agree with the recent points?
(ii) If not, has something surely changed in two years, or may the
difference be assigned to the combined sampling fluctuations in
establishing and checking the standard curve?

FIGURE 7

BASIC BIOASSAY DATA FOR EXAMPLE 22 (A LESS OBVIOUS EXAMPLE).

[Figure not reproduced. Plot labels: curve fitted by least squares to
18 points, two years old; recent check points. Horizontal axis: dose
(gammas). Vertical axis: survival time of fish in hours.]

The first question is answered by the split test, for comparing (19, 5)
with a 50-50 split yields (Figure 5) a separation of 16 mm which is
very highly significant.
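
A comparison with a 50-50 split is the sign test, so the graphical
verdict can be checked against the exact binomial tail. The sketch
below takes the paired count as (19, 5), as in the comparisons that
follow in this example; the function name is hypothetical.

```python
from math import comb


def sign_test_tail(larger: int, smaller: int) -> float:
    """Pr{count <= smaller} for a binomial (larger + smaller, 1/2) count."""
    n = larger + smaller
    return sum(comb(n, k) for k in range(smaller + 1)) / 2**n


one_sided = sign_test_tail(19, 5)   # chance of 5 or fewer points on one side
two_sided = 2 * one_sided           # allowing a departure in either direction
print(round(one_sided, 4), round(two_sided, 4))
```

Both tails are well under 1%, consistent with the 16 mm separation
being called very highly significant.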

The second question can be approximately answered as follows: the
original least square fit to 18 points was probably more accurate than
fitting a median to 18 points and less accurate than fitting a median to
36 points. A roughly fair test should come between a comparison of
(19, 5) and (9, 9) and a comparison of (19, 5) with (18, 18). These give
(Figure 5) ranges of 14.5 mm and 16.5 mm, which are both beyond the
5% level, indicating that the activity, or the fish, or the technique
has probably changed.

Critique of Example 22. While these are not the most thorough tests
which can be applied to this situation, anyone familiar with bioassay
computation will appreciate their speed, simplicity, and clarity.

PART V—REFERENCE MATERIAL 

MODIFICATION OF THE ANGULAR TRANSFORMATION

The original angular transformation 

The angular transformation was introduced by R. A. Fisher in 1922
[9, p. 326] in a genetic situation where a certain proportion was varying
by random fluctuation from generation to generation. In 1936, Bartlett
[2, p. 74] proposed its use on experimental data as a means of stabilizing
the variance when binomial data were subjected to the analysis of
variance. Various authors have proposed its use for various purposes;
a considerable number of references may be found in [13, 1947].

Bartlett's modification

In his 1936 paper, Bartlett also proposed an empirical modification
to make the transformation more effective near p = 0 and p = 1. This
was the device of transferring ½ of a unit from the larger count to the
smaller count. Thus (3, 29) would become (3.5, 28.5). This proved to be
helpful, but had the annoying feature that both (3, 4) and (4, 3) were
converted to (3.5, 3.5) which did not seem appropriate.

The smooth version 

The smooth way of obtaining the good effects of ±½ near the ends
and ±0 in the middle is to add ½ to each cell, thus passing from
(n-k, k) to (n-k+½, k+½). It is clear that for values of p near 0, ½,
and 1 this will stabilize the variance very well. How well requires a
numerical study, now in progress.

Correction for continuity 

Most of the applications of binomial probability paper discussed
above deal with tests of significance rather than with scoring paired
counts. We must try, then, to assign nearly normal deviates, not to
single paired counts, but to tails: to all (n-k, k) for which
$k \ge r$, for example. This is closely connected with the scoring
problem, since a natural dividing line between (n-r+1, r-1) and
(n-r, r) is (n-r+½, r-½), and in accordance with the last paragraph,
this is to be scored as if it were (n-r+1, r). Thus we expect to find
that

$$2\sqrt{n+1}\,\left(\sin^{-1}\sqrt{\frac{r}{n+1}} - \sin^{-1}\sqrt{p}\right)$$

is nearly the normal deviate associated with the probability that
$k \ge r$, where k is binomially distributed according to n and p.

Flattening 

Since the angles involved are rather small, it is plausible to replace
them by their sines. This is of course what has been done in the
examples, where we have always measured distances perpendicular to
the splits. A little trigonometry shows that the distance from
(n-r+1, r) to the p-q split is (in standard deviations)

$$2\left(\sqrt{p(n-r+1)} - \sqrt{qr}\right).$$

(To obtain distances in millimeters, replace 2 by 10.16 mm.)

Accuracy 

The accuracy of the over-all approximation to

$$\Pr\{k \ge r \mid k \text{ binomial } (n, p)\}$$

by

$$\Pr\{x \le 2(\sqrt{p(n-r+1)} - \sqrt{qr}) \mid x \text{ unit normal}\}$$

has been studied numerically, and a note giving details will be
submitted to the Annals of Mathematical Statistics (by Murray F.
Freeman and John W. Tukey). The general conclusion is that the
approximation is extraordinarily good near the 1% to 5% points, and
remarkably good in general.
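
A reader can spot-check the approximation directly. The sketch below
is an illustration only (the function names and the two test cases
are assumptions, not taken from the numerical study mentioned above);
it compares the exact binomial tail with the unit normal probability
at the flattened deviate $2(\sqrt{p(n-r+1)} - \sqrt{qr})$.

```python
from math import comb, erf, sqrt


def exact_upper_tail(n: int, r: int, p: float) -> float:
    """Exact Pr{k >= r} for k binomial (n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(r, n + 1))


def unit_normal_cdf(x: float) -> float:
    """Pr{X <= x} for a unit normal X, via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def approx_upper_tail(n: int, r: int, p: float) -> float:
    """Normal approximation Pr{x <= 2(sqrt(p(n-r+1)) - sqrt(q r))}."""
    q = 1.0 - p
    deviate = 2.0 * (sqrt(p * (n - r + 1)) - sqrt(q * r))
    return unit_normal_cdf(deviate)


# Illustrative cases: the tolerance-limit setting of Example 11 and a small-sample tail.
cases = [(74, 2, 0.10), (20, 10, 0.30)]
errors = [abs(exact_upper_tail(*c) - approx_upper_tail(*c)) for c in cases]
for c in cases:
    print(c, round(exact_upper_tail(*c), 4), round(approx_upper_tail(*c), 4))
```

In both cases the two probabilities agree to within a fraction of one
percentage point.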


THE INCOMPLETE BETA AND F DISTRIBUTIONS

The binomial distribution is, as is well known, given by the expansion
of

$$(q + p)^n, \qquad q = 1 - p,$$

where n is the number of cases, p the chance of a "success" and the
term

$$\frac{n!}{r!\,(n-r)!}\,(1-p)^{n-r}\,p^{r}$$

in the expansion is the probability of exactly r successes. The
probability of r or more successes is given by

$$S = \sum_{x=r}^{n} \frac{n!}{x!\,(n-x)!}\,(1-p)^{n-x}\,p^{x}.$$

Using the well known device of differentiating both sides with respect
to p and summing, we get

$$\frac{dS}{dp} = \frac{n!}{(r-1)!\,(n-r)!}\,p^{r-1}(1-p)^{n-r}.$$

Replacing p by t and integrating S from 0 to S, and t from 0 to p, we
have the usual relation

$$\Pr(r \text{ or more successes}) = I_p(r,\, n-r+1)$$

where $I_p(m, n)$ is the incomplete Beta-function. Hence if binomial
probability paper successfully represents the binomial distribution it
also successfully approximates the incomplete Beta-function.
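
The relation between the binomial tail and the incomplete
Beta-function can be confirmed numerically. The sketch below is an
illustration under stated assumptions: the function names are
hypothetical, the integral is evaluated by composite Simpson's rule
(valid here for parameters of 1 or more), and the case n = 74, r = 2,
p = 0.10 is borrowed from Example 11.

```python
from math import comb, exp, lgamma


def binomial_upper_tail(n: int, r: int, p: float) -> float:
    """Exact Pr{r or more successes} in n trials with success chance p."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(r, n + 1))


def incomplete_beta(p: float, a: float, b: float, steps: int = 2000) -> float:
    """Regularized I_p(a, b) for a, b >= 1 by composite Simpson's rule (steps must be even)."""
    def integrand(t: float) -> float:
        return t ** (a - 1) * (1.0 - t) ** (b - 1)

    h = p / steps
    total = integrand(0.0) + integrand(p)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    # Normalizing constant Gamma(a+b) / (Gamma(a) Gamma(b)), computed on the log scale.
    normalizer = exp(lgamma(a + b) - lgamma(a) - lgamma(b))
    return normalizer * total * h / 3.0


n, r, p = 74, 2, 0.10
lhs = binomial_upper_tail(n, r, p)          # Pr(r or more successes)
rhs = incomplete_beta(p, r, n - r + 1)      # I_p(r, n - r + 1)
print(round(lhs, 6), round(rhs, 6))
```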

Thus

$$I_p(r,\, n-r+1) \approx \Pr\{x \le 2(\sqrt{p(n-r+1)} - \sqrt{qr}) \mid x \text{ unit normal}\}$$

which seems, incidentally, to be a new analytic approximation to
the incomplete Beta-function. Simplifying notation, we find that
$I_p(m_1, m_2)$ corresponds to the distance from the p-q split to the
point $(m_1, m_2)$.

The ratio of two independent mean squares obtained from normal
variates of the same variance is Snedecor's F, which is related to
Fisher's z by $F = e^{2z}$. The ratio of the numerator sum of squares
to the total sum of squares may be written in terms of F as

$$x = \frac{n_1 F}{n_2 + n_1 F}$$

and its distribution is given by

$$\Pr\{x < p\} = I_p(\tfrac{1}{2}n_1,\, \tfrac{1}{2}n_2).$$



Hence, to the approximation of binomial probability paper, a ratio

$$\frac{s_1}{s_1 + s_2} = p$$

of sums of squares has a probability of arising from populations of
equal variance which is given by the tail area corresponding to the
deviation of the count $(\tfrac{1}{2}n_1, \tfrac{1}{2}n_2)$ from the
line p-(1-p), which is the same as the line $s_1$-$s_2$.

PART VI—INDEX, OUTLINE AND TABLES
Introduction 

Table 1 is not intended to replace the worked examples, but rather
to serve as a key for the new reader and a reminder for the old.

The short tables which follow are of standard distributions based on
the normal distributions. Since millimeters are convenient units for
use with binomial probability paper, they are given in both
millimeters and in standard deviation units.

For maximum accuracy, use a sharp pencil! (Regular thickness automatic
pencils may serve for some routine work, but finer lead will give
better results.) The figures have been drawn for clarity, not accuracy.

Remember these methods are all approximations.

TABLE 1

INDEX AND OUTLINE

Example | Page | Aim | Plotting Required | Remarks

Part II. Plotting one observed quantity

1 | p. 182 | Observed and theoretical proportions | 1 paired count; 1 split (theory) | Use short distance for significance level. Use both short and long for significance zone.
2 | p. 184 | Sign test | 1 split (50-50); 1 paired count |
3 | p. 186 | Confidence limits for proportion | 1 paired count; 2 splits (at distance) | Use short distance
4 | p. 186 | Confidence limits for median | 1 split (50-50); 2 paired counts (at distance) | Use short distance
5, 6 | p. 187, 189 | All F tests | 1 split (sums of squares); 1 point (½ degrees of freedom) |
7 | p. 189 | Angular transformation | 1 paired count; 1 split (through middle) |

Part III. Application to design

8 | p. 191 | Designing binomial experiment | 2 splits (theory); 2 parallel lines (at distance) | Distances correspond to one-sided significance levels at percentages to be controlled. (AQL and RQL = LTPD)
9 | p. 192 | Designing sampling plan | 2 splits (theory); 2 parallel lines (at distance) |
10 | p. 192 | Operating characteristic of sign test | 1 paired count; 1 split (at distance) | Use short distance
11 | p. 193 | Sample size for population tolerance limits | 1 split; 1 parallel line; 1 horizontal line | Split-to-line distance corresponds to desired confidence. Sum of counts-in determines horizontal
12 | p. 194 | Tolerance limits for second sample | 1 split; 2 paired counts (touching) | Split through 1st and 2nd sample sizes; distance to common vertex corresponds to confidence
13 | p. 194 | Operating characteristic of anova II | 1 point (½ degrees of freedom); 2 splits (at distance) | Compute from ratio of split ratios

Part IV. Several paired counts

14 | p. 195 | k proportions and a theoretical proportion | k paired counts; 1 split (theory); 2 parallel lines (±10 mm) | Expect 1 in 20 outside by middle distance.
15 | p. 198 | k proportions and a theoretical proportion | k paired counts; 1 split (theory) | Combine middle distances by crab addition (see p. 179)
16 | p. 199 | Stabilized p-chart | k paired counts; 1 split (assumed level) | Transfer to tracing paper as control chart
17 | p. 201 | Homogeneity of k proportions (k small) | k paired counts; 1 split (sum) | Combine middle distances by crab addition (see p. 179)
18 | p. 202 | Homogeneity of k proportions ($k \le 20$) | k paired counts; 1 split (sum) | Range of middle distances
19 | p. 202 | Homogeneity of k proportions (k large) | k paired counts; 1 split (sum); 4 parallels (±5 mm, ±10 mm) | 12 outside + 3 between - 2 inside. 5%, $5.65\sqrt{k}$; 1%, $8\sqrt{k}$. Use [text lost in source]
20 | p. 203 | Homogeneity of k unsymmetrical proportions | As 17 or 19 with large count divided | As 17 or 19 with distances in undivided direction
21 | p. 204 | Four-fold table | 2 paired counts; 1 split (sum) | Range from middle distances
TABLE 2

MILLIMETER TABLE FOR NORMAL DEVIATE

[Table body not reproduced. Column headings: significance level
(one-sided, two-sided); normal deviate (millimeters, multiples of
$\sigma$).]

TABLE 3

MILLIMETER TABLE FOR CHI-SQUARE

[Table body not reproduced. Column headings: undoubled millimeters and
multiples of $\sigma$, at an upper significance level of each tabled
percentage.]

TABLE 4

MILLIMETER TABLE FOR NORMAL RANGES

[Table body not reproduced. Column headings: number of observations;
millimeters and multiples of $\sigma$, at upper significance level of
each tabled percentage.]


REFERENCES 

[1] American Society for the Testing of Materials, "Manual on Presentation of
Data," Philadelphia (1940).

[2] M. S. Bartlett, "The square root transformation in analysis of variance,"
Suppl. J. Roy. Stat. Soc., Vol. 3 (1936), pp. 68-78.

[3] M. S. Bartlett, "The use of transformations," Biometrics, Vol. 3 (1947),
pp. 39-52.

[4] W. J. Dixon and A. M. Mood, "The statistical sign test," Journal of the
American Statistical Association, Vol. 41 (1946), pp. 557-566.

[5] Acheson J. Duncan, "Detection of non-random variation when size of
sample varies," Industrial Quality Control, Vol. 4 (1947-48), No. 4, pp. 9-12.

[6] Churchill Eisenhart, "The assumptions underlying the analysis of
variance," Biometrics, Vol. 3 (1947), pp. 1-21.

[7] J. P. English, F. A. Willius, and J. Berkson, "Tobacco and coronary
disease," J. Am. Med. Assn., Vol. 115 (1940), pp. 1327-1328.

[8] C. O. Fawcett, et al., "A second study of the variation and correlation of
the human skull, with special reference to the Naqada Crania," Biometrika,
Vol. 1 (1902), pp. 408-467.

[9] R. A. Fisher, "On the dominance ratio," Proc. Roy. Soc. Edinburgh, Vol. 42
(1922), pp. 321-341.

[10] R. A. Fisher and K. Mather, "The inheritance of style length in Lythrum
salicaria," Annals of Eugenics, Vol. 12 (1943), pp. 1-23.

[11] R. A. Fisher and K. Mather, "A linkage test with mice," Annals of Eugenics,
Vol. 7 (1936), pp. 265-280.

[12] Frederick Mosteller and John W. Tukey (designers), Binomial Probability
Paper, Codex Book Company, Norwood, Mass., 1946.

[13] Statistical Research Group, Columbia University, Selected Techniques of
Statistical Analysis, McGraw-Hill, 1947.



TEACHING STATISTICAL QUALITY CONTROL FOR 
TOWN AND GOWN* 


Edwin G. Olds 
Carnegie Institute of Technology 

AND 

Lloyd A. Knowler
State University of Iowa 

During the next five years, in manufacturing plants and in
the engineering schools, there will be many new programs ini¬ 
tiated to meet the demands for education in Quality Control 
by Statistical Methods. This paper has been written with the 
hope that it will be of some help in the planning and executing 
of such programs. Beginning with a description of types of 
courses and a discussion of possible content, it touches on 
the questions of who is to do the teaching and how the subject 
matter might be presented and motivated. The importance of 
follow-up work is stressed. The latter part of the paper dis¬ 
cusses the choice and utilization of instructional aids and de¬ 
scribes some of the materials now available. 

I—TYPES OF COURSES

These remarks concerning the teaching of Statistical Quality
Control will be confined to four general types of courses, each of 
which permits various subdivisions. The four general types will be 
referred to as: (1) intensive or so-called ten-day courses; (2) extension 
and evening courses which, by nature, usually are given on a part-time 
basis; (3) university or college credit courses; and (4) in-plant training 
courses. Naturally these types are not necessarily mutually exclusive. 

Intensive Ten-Day Courses. 

Although statistical work with special reference to applications in
industry has been given in colleges and universities for many years, it
is believed that it received a great impetus during World War II because
of the thirty-odd so-called eight-day courses in Quality Control by
Statistical Methods, most of which were sponsored, to a large extent,
by the War Production Board and the United States Office of Education
under the Engineering, Science, and Management War Training Program
in cooperation with various educational institutions. Representatives
from government and industries engaged in production directly related
to the war effort were permitted to attend these tuition-free courses.
Many of the trainees in the short courses received their first formal
background in statistical procedures. Because of some continued need
for the type of training which can be given in these courses, a few
educational institutions have continued to sponsor similar courses on a
moderate tuition or fee basis.

* Presented at the Section on Training of Statisticians of the American Statistical Association, joint
with the American Society for Quality Control and the Institute of Mathematical Statistics, at Cleveland, Ohio,
on December 27, 1948. Parts I and II were presented by Lloyd A. Knowler; Part III, under a separate
title, was presented by Edwin G. Olds.

The basic course has generally been expanded to ten days. A typical
ten-day intensive course has the first day set aside for executives. The
Executives’ Session is devoted to an explanation of the aims and
possibilities of a quality control program, and procedures for the
installation of such a program in the plants are indicated. In addition to
the executives, the session is also attended by the trainees who expect
to remain the entire ten days.

The program during the remaining nine days consists of a series of 
conferences, lectures, and laboratory periods. In addition to daily ses¬ 
sions, three or four evening sessions are scheduled. Instructors in the 
courses are continually bombarded with questions during the short in¬ 
termissions between sessions. General lectures are given to the trainees. 
The discussion of specific problems and the working of laboratory 
exercises is facilitated by dividing the general group into small sections. 
While the emphasis is to apply the statistical method to design, speci¬ 
fications, production, and inspection, the evening sessions and informal 
discussions during the day are often devoted to additional phases of 
statistical work. 

The course content in the intensive courses is ordinarily divided into 
two parts: (1) control charts; and (2) acceptance sampling. 

The use of control charts of the usual types (X̄, R; p; np; c; and u)
as a production tool is stressed; in fact, comparisons of a control
chart to a highway, and descriptions of a control chart as a picture, a
newsreel, or an advertisement of the workers’ product, are quite
common.

In studying a control chart it is usually necessary to introduce the
concept of a frequency distribution. Following this, or in connection
therewith, a control chart is constructed based upon shop data. This
is followed by a laboratory exercise and discussion. Thereafter
demonstrations of the effect of changes in average or range are given with
the use of chips or beads as indicated in the second phase of this
report. This general pattern of instruction is altered occasionally to fit
the particular needs of the trainees in the course.
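
The construction of an X̄, R chart from shop data can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the measurements are hypothetical, the function name `xbar_r_limits` is invented for the illustration, and the factors A2, D3 and D4 are the customary tabulated values for subgroups of five.

```python
from statistics import mean

# Customary Shewhart control chart factors for subgroups of size 5.
A2, D3, D4 = 0.577, 0.0, 2.114

def xbar_r_limits(subgroups):
    """Return (lower limit, center line, upper limit) for the X-bar and R charts."""
    xbars = [mean(g) for g in subgroups]            # subgroup averages
    ranges = [max(g) - min(g) for g in subgroups]   # subgroup ranges
    grand_mean = mean(xbars)
    rbar = mean(ranges)
    return {
        "xbar": (grand_mean - A2 * rbar, grand_mean, grand_mean + A2 * rbar),
        "range": (D3 * rbar, rbar, D4 * rbar),
    }

# Hypothetical shop data: five coded measurements per subgroup.
shop_data = [
    [2, 0, -1, 1, 3],
    [1, -2, 0, 2, 1],
    [0, 1, 1, -1, 2],
    [3, 0, -2, 1, 0],
]

limits = xbar_r_limits(shop_data)
for chart, (lower, center, upper) in limits.items():
    print(f"{chart}: LCL = {lower:.3f}, center = {center:.3f}, UCL = {upper:.3f}")
```

In practice many more than four subgroups would be collected before limits are set; four are used here only to keep the arithmetic easy to follow by hand.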

It is emphasized that new uses are being discovered constantly and that
additional concerns are making use of statistical procedures as more
individuals are trained. It is pointed out that a single chart on a
single machine for a single operator in a large corporation is
essentially a one-man company. The value of a control chart for short
runs and its value when applied to new lines of production are more
than noted. Quality and economy are stressed. The diagnostic value and
the predictive value of determining when and where to look for trouble
are also stressed.

The introduction of acceptance sampling in the basic course pertains
largely to the Army Ordnance tables, the Dodge-Romig tables, and an
introduction to sequential analysis together with a demonstration and
discussion of the defects of 10 per cent sampling.

Some attention is given to the construction of a sampling plan, 
largely to bring out the value of an operating characteristic curve and
an average outgoing quality curve. The use of a certified control chart 
in the manufacture of goods to replace acceptance sampling, or as an¬ 
other type of acceptance sampling, has been advocated by the presi¬ 
dents of at least two large corporations in this country who use and 
who realize the value of control charts. 
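
The operating characteristic curve and the average outgoing quality curve of a single sampling plan can be computed directly, as in the following Python sketch. The plan shown (lots of 1000, sample of 50, acceptance number 1) is hypothetical, as are the function names. The sketch assumes a binomial model for the sample and assumes that rejected lots are screened 100 per cent, with every defective found replaced by a good piece.

```python
from math import comb

def prob_accept(p, n, c):
    """Probability that a sample of n shows at most c defectives
    when the process fraction defective is p (binomial model)."""
    return sum(comb(n, d) * p**d * (1 - p)**(n - d) for d in range(c + 1))

def average_outgoing_quality(p, n, c, lot_size):
    """AOQ, assuming rejected lots are screened and defectives replaced."""
    return p * prob_accept(p, n, c) * (lot_size - n) / lot_size

# Hypothetical single sampling plan.
LOT_SIZE, SAMPLE_SIZE, ACCEPT_NUMBER = 1000, 50, 1

fractions = [i / 1000 for i in range(151)]   # p from 0 to 0.150
oc_curve = [prob_accept(p, SAMPLE_SIZE, ACCEPT_NUMBER) for p in fractions]
aoq_curve = [
    average_outgoing_quality(p, SAMPLE_SIZE, ACCEPT_NUMBER, LOT_SIZE)
    for p in fractions
]
aoql = max(aoq_curve)                        # peak of the AOQ curve on this grid

for p in (0.01, 0.02, 0.05, 0.10):
    print(f"p = {p:.2f}   P(accept) = {prob_accept(p, SAMPLE_SIZE, ACCEPT_NUMBER):.3f}")
print(f"approximate AOQL = {aoql:.4f}")
```

Plotting `oc_curve` and `aoq_curve` against `fractions` gives the two curves; the peak of the second is the average outgoing quality limit for the assumed plan.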

In addition to a regular staff in charge of the course, three or four
periods are given over to representatives from different industries 
using statistical quality control who discuss the practical applications 
in their own control problems. 

It has been found fruitful, in some instances, to maintain a rather
aggressive follow-up program including at least two two-day clinics in
which the trainees attending the course meet with representatives of
the instructional staff to discuss the application of quality control to
specific manufacturing problems, to check on the correct application
of statistical procedures, and to provide an opportunity for exchanging
information. These follow-up meetings are exceptionally well attended.
In some instances additional statistical topics such as correlation, 
analysis of variance, design of experiments, chi-square, and other tests 
of significance are considered. For the greater part, these additional 
techniques are taken up in section meetings which tend to form as a 
result of the short courses. 

It has also been found desirable to give so-called advanced courses of
eight days’ duration. The topics included in a typical course are: a
review of control charts; significance of differences; analysis of variance;
correlation, linear and multiple; further aspects of acceptance sampling;
further aspects of sequential analysis; chi-square; and use of
calculating machines. An instructional pattern similar to that used for
the elementary courses is followed, and follow-up programs are also
maintained.

The usual experience in these intensive courses is to find that, 
initially, representatives from industries want and demand only so- 
called practical materials. After taking the course and making some 
uses of their procedures, they learn the value of statistics and then 
decide that the difference between practice and theory is rather fuzzy, 
if a difference even exists. In fact, many take the view that what works 
in practice does so because it has a reason and that the reason is the 
theory. It is these “practical” persons who ordinarily request additional 
“theoretical” work. In fact, they often ask that what might be a semes¬ 
ter’s or year’s work in mathematical statistics be presented during one 
forenoon, possibly in one hour; not only be presented but be set out in 
such a way that they can go back to their plant and use it in the par¬ 
ticular problem at hand. 

Extension and Evening Courses. 

Many of the extension and evening courses given once or twice a
week, occasionally once a month, follow the same general pattern as 
the intensive courses although to some extent many of the discussions 
are a little more formalized. Naturally the material on each topic tends 
to be more self-contained to the extent that it is usually unnecessary 
to attend the first lecture or two in order to take something away 
from the later ones. 

It is encouraging to note that enrollment tends to be maintained 
in these evening courses even though a year may be spent in one series. 
It is also encouraging to note that the same individuals find it profit¬ 
able to repeat a series so as to get new ideas from different lecturers or 
instructors. 

There is a trend toward offering an advanced, as well as an ele¬ 
mentary, series of lectures. 

University and College Credit Courses. 

The content of university credit courses may not be so clearly de¬ 
fined. Because of the broad aspects of mathematical statistics, each 
person who is instructing a course of this type has had different experi¬ 
ences; it would be natural for them to draw upon their experiences to a 
considerable extent. It is believed, however, that through discussions 
such as this an approach to standardization will be effected. 

The catalog description of such a course might read: “The theory
and applications of that part of mathematical statistics used in
maintaining control of the quality of a manufactured product or of a service;
in the construction and use of acceptance tests, and the associated
concepts of the operating characteristic curve and the average outgoing
quality”; or references might be made to “Elementary statistical
methods and their application to industrial problems; construction and
interpretation of Shewhart control charts; Sealy’s modified techniques;
Dodge-Romig and Army Ordnance Tables for acceptance sampling;
quality assurance for sampling by measurement; introduction to
sequential analysis; methods of correlation; elementary analysis of
variance.”

A course in Statistical Quality Control should be given in that college
or division where it is accessible to the student, not merely possible
for him to register. It is relatively easy to effect this result in
some schools; in others a special problem may be created. In so far
as engineering colleges are concerned, a course in Statistical Quality 
Control might be given during the junior, senior or graduate year. In 
such instances the student will already have studied through the cal¬ 
culus, hence many more of the problems can be approached directly. 
The student, through his shop experiences, has a better idea as to the 
working of a machine, if not the working of industry. 

For that reason, a three semester hour course following the general 
pattern of the recent avalanche of books on Statistical Quality Control, 
with such supplementary reports as the instructor desires, may orient 
the average student about as far as is necessary. The developments
and illustrations of various principles would naturally be supplemented 
by demonstrations such as will be pointed out in the third part of this 
report. 

Many students will observe the desirability of more work. Addi¬ 
tional work in mathematical statistics may be taken immediately or 
after a year or two of experience. The additional topics which could be 
covered in a subsequent course or courses will be of considerable value 
for the quality control engineer of a large company. In fact, it is very 
likely that it will be of value to the quality control engineer in a smaller 
concern. It is not unusual for executive officers of large corporations to
seek a quality control engineer with a background in shop and only a
little work in statistics. After they have gained some understanding of
the subject, however, the reverse is true. The demand for well trained
statisticians is well known today. Much of this demand has been
augmented, however, by initial applications under the direction of
open-minded industrial executives. In fact, it might be observed that
these persons have been most effective promoters and have also made
possible many interesting problems for investigation. They have not been
backward in advocating the need in their company for a person with
training equivalent to a Ph.D. in mathematical statistics. These
executives are particularly insistent, however, that the quality control
engineer with such advanced training spend considerable time in shops
and in assisting in the applications. They have realized that the main
reason many executives and supervisors do not use mathematical
statistics is that they have little or no knowledge of it.

In-Plant Training Program. 

The in-plant training program is a very important phase of statistical 
quality control. Some of the larger companies have held a two-day
training program for their top executives in which they attempt to
give a rather broad overall picture of the subject. A program to achieve
such an objective might be as follows: an introduction to statistical
quality control; construction and interpretation of X̄, R control
charts; discussion of additional types of control charts; introduction to
acceptance sampling; and suggestions on putting quality control to
work. Executives are very much interested in such a program. They 
are particularly interested in the demonstrations which seem to have 
some relation to their plant operations. 

Following this two-day program, the executives then select an indi¬ 
vidual from each plant in their company to take a ten-day course either 
sponsored by the company or by an educational institution. If within 
the company, the material considered is that pertinent to their own 
problems. The quality control engineers then return to their separate 
plants and make a pilot run as an application. They then train persons 
of supervisory and operator level in those particular aspects in which 
those persons happen to be most interested and for which they feel the 
greatest need. This may be the use of control charts for production or 
for design, or it may be some aspect of acceptance sampling. In the 
meantime, the quality control engineers join various statistical soci¬ 
eties, start a library, and continue their study. It is not unusual for 
them to seek personnel from the educational institutions to assist 
them. 

The procedure of having the in-plant training program divided into 
several periods wherein classroom discussions of a week or two are fol¬ 
lowed by a week or two of work in the plant, and the process repeated, 
has a great deal of promise. 





II—PREPARATION OF TEACHERS

Since statistical quality control includes so many other modern
statistical techniques in addition to control charts and acceptance 
sampling, it might not be out of order, in a paper such as this, to make 
a few remarks concerning the preparation of teachers. It is recognized 
that much depends upon the individual in charge of the course. In the 
first place, it is essential that the teacher have a knowledge of the 
subject and be interested in transmitting that knowledge or in getting 
another individual to gain similar knowledge. As with so many profes¬ 
sions, the work of a quality control engineer is getting beyond the cook 
book style. It is rapidly becoming essential that the prospective teacher 
study two or three years of mathematical statistics, with one or two 
of these years in the Graduate College. In order for him to do this with 
much facility it is desirable, if not necessary, that he have a rather 
thorough background in pure mathematics. Also, the prospective 
teacher needs a background in some area such as engineering, business 
or industry from which to draw information. This is a large order, to 
be sure, but we need to realize that teaching is a profession and that it 
should be so recognized, both in the preparation required and in mone¬ 
tary rewards. 

III—INSTRUCTIONAL AIDS

In the previous sections we have discussed what material might be 
taught, how it might be organized and motivated, and who should do 
the teaching. In this section attention will be directed toward the work¬ 
ing tools needed and how they can be used to clarify quality control 
principles and develop power in analyzing manufacturing problems. 
Teachers in many fields seem to be capable of presenting their subjects 
acceptably when equipped with nothing more than a basic text, a port¬ 
able blackboard, a stick of chalk and an eraser. For many, in fact, this 
seems to be the maximum list of requirements. Such does not seem to be 
the case for statistical quality control where successful teachers find 
effective use for a large collection of various kinds of paraphernalia. 

Classroom and laboratory equipment can be classified under five 
general headings: (1) Textbooks; (2) Material for supplementary read¬ 
ing; (3) Problems; (4) Calculating machines; and (5) Gadgets. Some 
brief comments on each may prove helpful to embryo teachers.

Texts. 

To remark that the success of a course depends, to a considerable
extent, on the choice of text is distressingly trite, but in this case, it
seems necessary. When the Office of Production Research and Development
initiated its statistical Quality Control Program early in 1943,
pitifully few books on the subject were available, and most of them 
would be classified as treatises, rather than texts. Since the war, pub¬ 
lishers have been able to put on the market a considerable number of 
books with the Quality Control label, but it seems fair to assume that 
most of the books were not prepared for use as basic texts in the field. 
Before any of them are adopted for the classroom the following ques¬ 
tions should be considered: 

1. Does the book give a sound exposition of the general philosophy 
and statistical principles basic to statistical quality control? 

2. Is the exposition of sufficient breadth and completeness to meet 
the needs of the course? 

3. Is the level of sophistication appropriate, in view of the students
for whom the course is planned? 

4. Will the book generate enthusiasm for statistical quality control 
as an engineering tool? 

5. Is the arrangement and problem content such that effective day- 
by-day assignments can be made? 

It would be surprising to find many books meeting all five tests satis¬ 
factorily. A book written to sell Quality Control to the busy executive 
would miss its aim if it attempted more than a brief overview of the 
simplest methods together with a briefer explanation of why they work. 
A manual prepared for use in the shop seldom will contain any derivation
of the formulas being applied or hint as to many of their limitations.
On the other hand, a book directed at advanced undergraduate or
graduate engineering students with calculus, and perhaps some elementary
statistics “under their belts” would be tragic for an in-plant
training course at the supervisory level. 

The rapidity with which manufacturing organizations have adopted 
statistical methods for the improvement of their operations can be at¬ 
tributed, to a large extent, to the missionary efforts of industrial men 
who were trained in short wartime courses. Reports of savings in man¬ 
power and materials, together with improvement in quality spurred 
them to study the methods, to apply them to their own problems, and 
to spread the gospel. There is no diminution of need to whip up
enthusiasm but, with the frenzied haste of war work at an end, there is
more time for careful appraisal of success stories. Furthermore, the
serious student is more likely to be impressed by the powerful nature
of statistical methods than by dollars saved. In no sense does this
comment imply that case studies should be deleted from a text or
course, but rather that they should be chosen cautiously and imbedded
in enough detail so that the student has some opportunity to check the
correctness of the solution.

Students differ in the amount of verbiage they need, but almost without
exception, they need problems to clarify and emphasize principles.
Problems daily, like apples, seem to have a most beneficial effect. Texts
without problems, or with problems involving excessive amounts of
mechanical manipulation and calculation, place a heavy burden on
student and teacher. It does not follow that, simply because a set of 
data originated in a plant, it will be an efficient hammer to drive home 
a principle. 

One further remark on the choice of a text. A successful book or 
course does more than arouse interest, inculcate principles, and develop 
skills; it must create and foster the urge for continuation of the 
learning process; and it must chart the path and locate some of the 
important land marks. In other words the book should contain a 
generous and appropriate list of references which will carry the student 
beyond its boundaries in all directions. 

Material for Supplementary Reading.

Even if a text with a satisfactory set of references is located, it is too
much to expect either that all useful material will be included or that
all of the papers listed will be readily available. For those readers who
have examined university, city or company libraries this lack of
availability need not be elaborated. Therefore, the teacher of statistical
quality control will find it necessary to compile his own list for outside
reading and find a way of giving his students easy access to any
unpublished reports which they should see. The authors of this paper are
much relieved that there is no room in this paper to include
recommendations regarding books to purchase. Fortunately, many of the
leading statistical, mathematical and engineering journals not only
list new books received, but also print authoritative and prompt
reviews of many of them. At least two bibliographies on statistical quality
control [1] [2] have been compiled rather recently. Both books and
periodicals are listed, but the field of coverage is largely restricted to
control charting and acceptance sampling. Furthermore, both will be
somewhat out of date by the time this paper reaches publication.

The bi-monthly journal, Industrial Quality Control, is the official
publication of the American Society for Quality Control and, naturally,
caters to the needs of its members. These needs are not greatly different
from those of beginning students in the field and, therefore, the
publication is a rich source of material. The Supplement to the Journal
of the Royal Statistical Society also specializes in industrial applications
of statistics, but at a somewhat higher level. Also, useful papers will
be found in the Journal of the American Statistical Association and
Biometrika.

The Annals of Mathematical Statistics is devoted, primarily, to basic
theory, but some of the theory is directly applicable to quality control 
problems, even under the narrowest meaning of the appellation. 

A set of twelve Quality Control Reports, prepared at the close of the 
war under the auspices of the Quality Control Program at Carnegie 
Institute of Technology, is still available and can be obtained at a cost 
of two dollars from the Office of Technical Services, Department of 
Commerce, Washington. In the main, these are case histories of initia¬ 
tion of quality control methods. 

Excellent papers on statistical methods and their applications ap¬ 
pear from time to time in many scientific, engineering and trade jour¬ 
nals. These may pass unnoticed unless students are briefed to watch 
for and report them. Several of the societies hold occasional symposia 
on statistical methods. Usually the reports of these meetings are worth 
locating. Unpublished plant reports are useful, particularly if prepared 
by former students. 

Sources of Problem Material. 

There is a dearth of satisfactory problem material for easy reference.
Most teachers of statistical quality control do not hesitate to beg,
borrow, or even purloin all of the industrial data they can. But most
files contain pitifully few really good problems. Perhaps “good problem”
should be better described. Mr. Wyatt Lewis, of the Ontario,
California plant of General Electric Company, contributed such a
problem for inclusion in the Outline Manual for Quality Control by
Statistical Methods, which was written by Working and Olds and used by
them and by many others in the teaching of intensive eight-day
courses during the war. (The Manual has been available for two dollars
from the same source as the Quality Control Reports mentioned
above.) The problem is called the Rheostat Knob problem and is an
application of the Shewhart Control Chart techniques. Data on an
easily described quality characteristic are given for several days’
production for which both specifications and manufacturing conditions were
changed. Interesting details are supplied. The problem divides naturally
into several parts, each of which can be assigned as a separate
exercise. When a calculating machine is used or when the numbers
are coded, the time for computation is not excessive.

The above-mentioned manual has a few other good problems. 
Manuals prepared for intensive courses given more recently at various 
universities have others. Recent books have some. Others can be manu¬ 
factured by judicious use of the data and circumstances given in tech¬ 
nical articles. Good problems can be dug up, but a modicum of work is 
required, and, perhaps, a little harmless chicanery.

Calculating Machines. 

Recently one of the authors read about an elementary course in 
statistics which was said to have been given quite successfully (“not
ideally!”) without the use of a computing laboratory. A similar remark
might be made regarding beginning courses in statistical quality control.
Obviously it is debatable and would not apply to courses at an
advanced level. Statistical quality control can be done without
calculating machines, and there seem to be two principal arguments for such
a procedure: (1) economy; and (2) the distressing fact that some 
industrial organizations do not furnish machines for their quality 
control departments and so their men need to learn how to get along 
without them. 

For the most effective teaching of statistical quality control the 
authors agree in favoring a computing laboratory with sufficient equip¬ 
ment to accommodate a class of students. Not all classes would be held 
in the laboratory but, when used for a class, each member would have 
an automatic calculator of the same model. Needless to say, the lab¬ 
oratory should be kept open at stated hours for individual work, with 
an assistant in charge, competent to teach the operation of the ma¬ 
chines and responsible for their ordinary care. 

It is useful to have late models of the principal kinds of calculators 
available, so as to broaden the students' experience. Punch-card equip¬ 
ment is a welcome addition. Several of the gadgets to be described in 
the next section are a proper part of laboratory equipment. So are 
statistical tables and some reference books. If space and supervision are
adequate, there is little danger of collecting too much equipment. 


Gadgets.

In days of fairly recent yore, the average teacher of elementary
probability reached the heights when he had each of his students toss
a coin 100 times, recording the succession of heads and tails. Then he
was content to retreat to a consideration of either the few reports of
such experiments in the literature or “cooked-up” examples of the
same type. It is hard to understand why he failed to appreciate the
pedagogical value of designing an experiment to illustrate a point of
theory, predicting the result, running the experiment, and then taking
the consequences if it turned out wrong. Whatever the reason, it is
fortunate for the field of statistical quality control that its leading
teachers have broken away from tradition and have shown no hesitation
in using any available gadget which promised to assist in fixing a
concept in the student’s mind.

EXHIBIT 1

PLAN FOR 50 HOLE SAMPLING PADDLE
(adapted from a drawing by H. B. Rogers)

(Drawing not reproduced.)

At some time during the last five or six years not only thousands
of students, but scores of corporation presidents and their associates
have been introduced to the vagaries of acceptance sampling by means
of a box of beads and a 50-hole sampling paddle. The authors do not
know who originated the paddle, but the man who made it famous
was Holbrook Working. His paddle (see Exhibit 1) started as a wooden
board about 6" × 6" × 1". In it were sunk five rows of ten holes each,
about 9/16" apart, center to center, 7/16" in diameter and 13/32"
deep. Near one edge the board was grooved to provide a handle. The
rest of the equipment consisted of a cardboard box of several hundred
10-mm wooden beads (about 1200 white and 200 red). When Dr.
Working went to the Walco Bead Company, 37 W. 37th St., New
York City to purchase the beads and explained his purpose he met with
astonishment, if not incredulity.


EXHIBIT 2

PLAN FOR 5 AND 10 HOLE SAMPLING PADDLE
(adapted from a drawing by H. B. Rogers)

(Drawing not reproduced.)


The best known demonstration is concerned with illustrating the 
weaknesses of 10 per cent sampling. In the box 1152 white beads 
(good items) are mixed with 48 red beads (defective items). A paddle of 
beads (representing a lot from a controlled process) is withdrawn, the 
munber of red beads counted, and the fifty beads dumped into a second 
box. Then, with a 5-hole paddle (see Exhibit 2), five beads are scooped 
from the fifty and number of defectives noted. The agreement is to 
accept the lot only if th^ are no defectives in the sample. Of course 
some of the lots accepted are worse than many of the lots rejected. 
Also the per cent defective in the uninspected portions of rejected lots 



226 


AMERICAN STATISTICAL ASSOCIATION JOURNAL, JUNE 1949 


EXHIBIT 3

RECOMMENDED FREQUENCY DISTRIBUTIONS FOR SETS OF DISKS
FOR SAMPLING FOR VARIABLES*

Set No.               1A      1B      1C      2       3A      3B      4
Color of Figures      Green   Black   Black   Blue    Black   Red     Red
Mean                  0       0       0       +2      0       +1      +4
Standard Deviation    1.715   1.715   1.715   1.715   3.470   3.470   1.715

Numbers               Frequency
  -10                                                 1
   -9                                                 1       1
   -8                                                 1       1
   -7                                                 3       1
   -6                                                 5       3
   -5                 1       1       1               8       5
   -4                 3       3       3               12      8
   -3                 10      10      10      1       16      12
   -2                 23      23      23      3       20      16
   -1                 39      39      39      10      22      20      1
    0                 48      48      48      23      23      22      3
    1                 39      39      39      39      22      23      10
    2                 23      23      23      48      20      22      23
    3                 10      10      10      39      16      20      39
    4                 3       3       3       23      12      16      48
    5                 1       1       1       10      8       12      39
    6                                         3       5       8       23
    7                                         1       3       5       10
    8                                                 1       3       3
    9                                                 1       1       1
   10                                                 1       1
   11                                                         1

Total                 200     200     200     200     201     201     200

* It is necessary to have sets designated as 1A, 2, 3A and 3B, or sets differing little from these as
regards means and standard deviations. Sets 1A, 2, and 4 will be thrown together for one demonstration,
and bear figures in different colors to permit subsequent sorting. Sets 1B and 1C are used in a
demonstration that calls for drawing from 5 bowls, 3 of which are alike.

The chips are white plastic discs, 9/16" in diameter and 3/32" thick. They can be purchased from
Lamb Seal and Stencil Company, 824—13th Street, N.W., Washington, D. C. The wooden beads 
used in the sampling demonstrations are 10 mm. in diameter. They can be purchased from the Walco 
Bead Company, 37 W. 37th Street, New York City. A supply of 1200 white, 200 red, and 100 blue 
beads is suggested. 


is about the same as the per cent defective in the uninspected portions 
of accepted lots, etc., etc. 
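
The bead-box demonstration can also be imitated by calculation. The following Python sketch is offered only as an illustration and is no part of the apparatus described: it assumes that the fifty beads are returned to the box and remixed after each lot, and the function name, the number of lots and the seed are arbitrary choices.

```python
import random


def simulate_lots(n_lots=20000, seed=1):
    """Imitate the bead box: lots of 50 beads, samples of 5, accept on zero reds."""
    rng = random.Random(seed)
    box = [1] * 48 + [0] * 1152            # 1 = red (defective), 0 = white (good)
    n_accepted = 0
    reds_rest_accepted = 0                 # reds in uninspected part of accepted lots
    reds_rest_rejected = 0                 # reds in uninspected part of rejected lots
    for _ in range(n_lots):
        lot = rng.sample(box, 50)          # one paddle of 50, drawn without replacement
        sample, rest = lot[:5], lot[5:]    # first 5 beads act as the 5-hole paddle
        if sum(sample) == 0:
            n_accepted += 1
            reds_rest_accepted += sum(rest)
        else:
            reds_rest_rejected += sum(rest)
    n_rejected = n_lots - n_accepted
    return {
        "accept_rate": n_accepted / n_lots,
        "pct_def_rest_accepted": 100 * reds_rest_accepted / (45 * n_accepted),
        "pct_def_rest_rejected": 100 * reds_rest_rejected / (45 * n_rejected),
    }


result = simulate_lots()
for name, value in result.items():
    print(name, round(value, 2))
```

Comparing the last two figures shows how little the 10 per cent sample tells about the forty-five beads that were not inspected.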

It requires considerable restraint on the part of the authors to prevent
them from devoting an entire paper to a description of the various
uses of the paddle in connection with standard sampling tables, control








charts, tests of significance, confidence intervals and the like. A second 
paper could be written on “chips drawn from a bowl.” White plastic 
chips about f" in diameter and S/lfi” thick are marked with numbers to 
simulate a normal distribution. Exhibit 3 gives suggested distributions
for the various bowls which were found useful in O.P.R.D. courses.
(These distributions seem to have been devised by Holbrook Working.)

Drawing samples of five from one of the bowls provides the data for
a control chart. Changing bowls gives a graphic picture of the effect
of a changed mean or standard deviation. With two bowls the meaning
of Fisher’s t-test becomes more clear. Correlate pairs of units and the
presence of sample correlation when the universe correlation is zero is
demonstrated in a fashion more telling than any appeal to theory. 
Use three bowls or more and analysis of variance can be introduced 
successfully. Possible demonstrations for stratified or sequential sam¬ 
pling are easy to plan and execute. The list of other useful demonstra¬ 
tions is almost unlimited. 
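
The first of these demonstrations, the control chart built from samples of five, may be sketched in Python as follows. The frequencies are those of Set 1A in Exhibit 3; the control chart factors A2 and D4 are the usual ones for subgroups of five. Everything else (the function names, the number of subgroups, the seed, and the return of chips to the bowl after each sample) is an assumption made for illustration.

```python
import random

# Set 1A of Exhibit 3: number on chip -> number of chips (200 chips in all).
SET_1A = {-5: 1, -4: 3, -3: 10, -2: 23, -1: 39, 0: 48,
          1: 39, 2: 23, 3: 10, 4: 3, 5: 1}

# Usual control chart factors for subgroups of five.
A2, D4 = 0.577, 2.114


def make_bowl(freq, shift=0):
    """Return the list of chips in a bowl, optionally shifting every number."""
    return [value + shift for value, count in freq.items() for _ in range(count)]


def draw_subgroups(bowl, k, rng, n=5):
    """Draw k samples of n chips; chips go back into the bowl after each sample."""
    return [rng.sample(bowl, n) for _ in range(k)]


def control_limits(subgroups):
    """Compute X-bar and R chart limits from subgroup averages and ranges."""
    xbars = [sum(s) / len(s) for s in subgroups]
    ranges = [max(s) - min(s) for s in subgroups]
    grand_mean = sum(xbars) / len(xbars)
    mean_range = sum(ranges) / len(ranges)
    return {
        "xbar_lcl": grand_mean - A2 * mean_range,
        "xbar_ucl": grand_mean + A2 * mean_range,
        "range_ucl": D4 * mean_range,
    }


def count_outside(subgroups, limits):
    """Count subgroup averages falling outside the X-bar limits."""
    return sum(
        not (limits["xbar_lcl"] <= sum(s) / len(s) <= limits["xbar_ucl"])
        for s in subgroups
    )


rng = random.Random(7)
bowl_a = make_bowl(SET_1A)                # mean 0
bowl_b = make_bowl(SET_1A, shift=2)       # same shape, mean +2 (as in Set 2)
limits = control_limits(draw_subgroups(bowl_a, 25, rng))
out_same = count_outside(draw_subgroups(bowl_a, 200, rng), limits)
out_shifted = count_outside(draw_subgroups(bowl_b, 200, rng), limits)
print(limits, out_same, out_shifted)
```

Limits are set from twenty-five samples out of the first bowl; the count of averages outside those limits then rises sharply when the bowl is changed for one with a shifted mean.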

Is it true that in random assembly, the square of the natural toler¬ 
ance of an addition of several components is equal to the sum of the 
squares of the natural tolerances of the components? Professor Mac- 
Crehan has a set of 100 blocks to demonstrate that this is good theory. 
There are five sets of painted blocks; red, black, white, yellow, and 
green. In each set the twenty blocks vary in thickness. Assemblies are 
made against an upright board, on which are painted pairs of lines 
bounding the natural and absolute tolerances for the assembly. When 
five blocks, taken at random, one from each set, are piled against the 
board, the top of the pile falls between the two natural tolerance limits. 
Yet five blocks can be found which will reach the upper absolute toler¬ 
ance line or just match the lower absolute tolerance. 
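
The rule can be checked numerically. In the Python sketch below the block thicknesses are hypothetical (they are not Professor MacCrehan's), the natural tolerance is taken as three standard deviations on either side of nominal, and the absolute tolerance as the worst case in which every block is at its extreme; all three choices are assumptions made for the illustration.

```python
import math
import random
import statistics

# Hypothetical deviations from nominal thickness, in thousandths of an inch,
# for the twenty blocks of one color. The same pattern is reused for all five sets.
DEVIATIONS = [-3, -2, -2, -1, -1, -1, -1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 3]
SETS = {color: list(DEVIATIONS)
        for color in ("red", "black", "white", "yellow", "green")}


def natural_tolerance(values):
    """Half-width of the natural tolerance, taken here as three standard deviations."""
    return 3 * statistics.pstdev(values)


def assembly_tolerances(sets):
    """Return (natural, absolute) tolerance half-widths for a pile of one block per set."""
    each = [natural_tolerance(v) for v in sets.values()]
    natural = math.sqrt(sum(t ** 2 for t in each))     # root of the sum of squares
    absolute = sum(max(abs(x) for x in v) for v in sets.values())  # worst-case pile
    return natural, absolute


def fraction_within(sets, limit, trials, rng):
    """Share of random five-block piles whose total deviation is within +/- limit."""
    hits = 0
    for _ in range(trials):
        total = sum(rng.choice(v) for v in sets.values())
        hits += abs(total) <= limit
    return hits / trials


natural, absolute = assembly_tolerances(SETS)
share = fraction_within(SETS, natural, 20000, random.Random(3))
print(round(natural, 2), absolute, share)
```

Nearly all random piles fall inside the natural limits, although the absolute limits, which are considerably wider, can still be reached by deliberately choosing the five extreme blocks.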

A less spectacular, but nonetheless useful, gadget is a table of random
sampling numbers. Several such tables are available [3], [4], [5].
While the same jobs can be done with a table as with the chips or beads,
the use of a random sampling table at the elementary level does not
seem to be very effective. At a higher level such a table is almost invaluable
in connection with distribution theory. Suppose, for example,
one is forced to investigate the behavior of statistics in random samples
from a very odd universe, one for which the density function might 
even be unknown. Numbers from the table can be so assigned that the 
universe is simulated in a form convenient for sampling. Then a large 
number of samples can be drawn easily and quickly, and the statistic 
under scrutiny can be calculated, tabulated, and studied. No further 




work may be needed to arrive at satisfactory answers to many practical 
problems. 
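
The procedure just described may be sketched in Python, with the language's pseudo-random generator standing in for a printed table of random sampling numbers. The "odd" universe, the choice of the median as the statistic, and all names below are hypothetical and serve only to illustrate the assignment of two-digit numbers to values.

```python
import random
import statistics
from collections import Counter

# Hypothetical "odd" universe: value -> how many of the two-digit numbers
# 00-99 are assigned to it (the counts must add to 100).
UNIVERSE = {0: 40, 1: 5, 2: 5, 7: 10, 9: 40}


def build_lookup(universe):
    """Map each two-digit random number 00-99 to a value of the universe."""
    lookup = [value for value, count in universe.items() for _ in range(count)]
    if len(lookup) != 100:
        raise ValueError("the assigned counts must add to 100")
    return lookup


def tabulate(lookup, statistic, sample_size, n_samples, rng):
    """Draw many samples and tabulate the chosen statistic."""
    table = Counter()
    for _ in range(n_samples):
        # rng.randrange(100) stands in for reading two digits from a printed table.
        sample = [lookup[rng.randrange(100)] for _ in range(sample_size)]
        table[statistic(sample)] += 1
    return table


lookup = build_lookup(UNIVERSE)
medians = tabulate(lookup, statistics.median, 5, 1000, random.Random(11))
for value in sorted(medians):
    print(value, medians[value])
```

The printed tabulation is an empirical sampling distribution of the median in samples of five, obtained without any knowledge of the density function of the universe.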

Having emphasized the point that gadgets are useful and lacking 
space for detailed description, the authors close this section with a list 
of several other physical aids to instruction. These gadgets, as well as 
those listed above can be used by students as well as teachers. In fact, 
students can learn some statistics by devising their own gadgets. 



(from a photograph supplied by W. E. Gibbons)

1. Galton Board or Quincunx 

2. Slot machines, seized in police raids 

3. Sets of dice, fair and biased

4. Slides and movies 

5. Colored plastic balls or ball bearings or marbles (These may be 
used with a paddle instead of beads. Both bearings and marbles 
are heavy and noisy.) 

6. A can, for mixing chips, mounted on a phonograph turntable.
The can is tilted so that the audience can observe the mixing pro¬ 
cedure. 

7. Pieces of wooden doweling of an assortment of lengths and diame¬ 
ters, with weights of pieces marked on them. This gadget is useful 
in explaining multiple correlation. 

8. A sampling machine designed by T. H. Brown, and D. H. Leavens. 
Beads are mixed by revolving a closed container shaped like an




oil can. A sample is obtained by tilting the can so that beads roll 
into its transparent spout. 

In addition to the gadgets built to provide for striking demonstra¬ 
tions of principles, the various mechanical devices concocted to help 
with practical application are deserving of mention. Exhibit 4 is a 
sketch of a slide rule being used effectively for sequential sampling. A 
slide-disk calculator has been designed for calculating standard devi¬ 
ations. Another type can be used in the shop to get control limits for 
average, range, and fraction defective charts. At least one quality 
control engineer inserted baffles in sheet metal cans of various sizes in 
such a fashion that they would mix lots of small parts and separate out
random samples of specified sizes. 

The authors have made no attempt to give an exhaustive list of gadg¬ 
ets. No doubt many readers have others of comparable merit. If so, 
it is to be hoped that careful descriptions of such gadgets, together with 
reasons for their construction and examples of their use, will be written 
and submitted for publication. 

ADDENDUM

At the suggestion of a referee, who was present at the meeting when 
the above paper was presented and discussed, some brief comments 
are being added. They may delineate more clearly the nature of sta¬ 
tistical quality control and indicate some of the main objectives of 
elementary courses in the field. 

Statistical quality control is the application of statistical methods to 
the improvement of the manufacturing operation. At each stage in the 
life cycle of a product, from recognition of consumer need, through
design, specification, fabrication and inspection, to final assurance of
consumer satisfaction, statistical method can play an effective role.
All brands of quality control are concerned with the same problems but
only statistical quality control utilizes statistical methods in solving
them. 

Industrial statistics and statistical quality control have many con¬ 
cepts and techniques in common but the dual classification is neces¬ 
sary. Many executives in charge of manufacturing regard industrial 
statistics as closely akin to business statistics and, therefore, mainly 
preoccupied with questions arising in sales and accounting. While 
these executives recognize the importance of such questions, they are 
primarily interested in problems of manufacturing. Therefore, they 
welcome any potential aid to quality control engineering which sta¬ 
tistical methods may have to offer. 




In a one-semester or two-semester course, or in intensive courses, 
there is neither the expectation nor the implication of transforming 
engineers into statisticians. The average engineer does not have the
prerequisite mathematical background for much theoretical statistics.
More preparation would be useful but it is a choice between taking
him as he is or missing the opportunity to provide him with a few of
the fundamental concepts and methods which he can grasp readily and 
apply with confidence. Most teachers try to pack as much statistics 
as possible into the time allotted. While statistical quality control 
should certainly reach beyond control charts and acceptance sampling 
by attributes, these two topics crowd a one-semester course. Paren¬ 
thetically, this may be the reason why some engineers view statistical 
quality control as comprising only control charts and acceptance 
sampling. 

In conclusion, it might be noted that there is little danger that a good 
teacher will encourage his students to view themselves as statistical
experts. It seems to be more general to find students of statistical 
quality control very conscious of their limitations. This has two good 
effects: they are willing to seek advice and they are eager to learn 
more statistics. 


REFERENCES 

[1] “Bibliography on Statistical Quality Control,” Report No. 11 of a series of
Quality Control Reports published under the OPRD Quality Control Program,
Carnegie Institute of Technology, Pittsburgh, 1945.

[2] G. I. Butterbaugh, “A Bibliography of Statistical Quality Control,” University
of Washington Press, Seattle, 1946.

[3] L. H. C. Tippett, “Random Sampling Numbers,” Tracts for Computers, No.
15, Cambridge University Press, London, 1927.

[4] M. G. Kendall and B. Babington Smith, “Tables of Random Sampling Numbers,”
Tracts for Computers, No. 24, Cambridge University Press, London,
1939.

[5] R. A. Fisher and F. Yates, Statistical Tables for Biological, Agricultural, and 
Medical Research, Oliver and Boyd, London, 1938. Table 33 is a table of 
random numbers. 



THE USE OF SAMPLING IN GREAT BRITAIN* 


C. A. Moser

Assistant Lecturer in Statistics, London School of Economics 

The object of this article is to survey the use of sampling 
methods in Great Britain. The most important sample surveys 
undertaken by government departments, research organisa¬ 
tions and commercial agencies are described with particular 
reference to their aims and sampling methods. Deficiencies in 
British sampling practice are discussed and suggestions for 
possible future developments are made. 

INTRODUCTION 

One of the most notable developments in the field of statistics in
recent years has been the increasing utilisation of sampling methods
in the study of human populations. Although a number of countries
have shared in this development, American advances in techniques
and applications of sampling have been the most striking, and are the
more to be valued because they have been so fully described in numerous
papers in the technical journals, more especially the J.A.S.A.

In Great Britain, the situation is rather different. Not only is the use 
of sampling more limited than in the United States but, for various 
reasons, the surveys that are carried out are rarely published. The 
result is that statisticians elsewhere have little opportunity of becom¬ 
ing familiar with the fields in which sampling is used in this country, 
and the methods employed. 

The reports issued in conjunction with the meetings of the U.N.O. 
Sub-Commission on Statistical Sampling [1] indicate both the consider¬ 
able use that is being made of sampling in many countries and the need 
for widespread collection of data and exchange of ideas on sampling 
methods. It is the purpose of this article to describe and comment on 
the more important sampling investigations carried out in this coun¬ 
try by Government departments, commercial organisations and other 
bodies; and to discuss the main features of English sampling practice, 
pointing out deficiencies and possible developments in the use of sam¬ 
pling here. Discussion will be confined to investigations concerned with 
human populations leaving out of account, for instance, research car¬ 
ried out on agricultural experimentation, where sampling methods of 
a far more involved and refined character are employed. Throughout

* My thanks are due to statisticians in Government departments and elsewhere for making available
to me much of the information on which this article is based.



the article, emphasis is on the fields of application and sampling pro¬ 
cedures, rather than the results of surveys. 

THE DEVELOPMENT OF SAMPLING IN GREAT BRITAIN

The present state of sampling in this country is best seen against the 
background of a long drawn out development. The first proper use of
sampling techniques was in Professor Bowley’s survey of Reading in
1912 [2]. He took approximately every 20th working-class household,
paying great attention to the calculation of sampling errors and to the
possibility of bias being introduced through substitution and refusals.
This pioneering use of sampling by Bowley proved a great stimulus to
social surveys, which had hitherto been based on non-random selection
(Booth, 1889) or complete enumeration (Rowntree, 1901). All the major
surveys of the inter-war years, such as the New London Survey, Mer¬ 
seyside, Southampton and others, follow to a greater or lesser extent 
the sampling methods first used by Bowley. 

The first use of sampling in connection with official information was
in John Hilton’s enquiry into “The Personal Circumstances and Industrial
History of 10,000 Claimants to Unemployment Benefit.” Extreme
care was taken to achieve a representative sample (the 10,000
workers were approximately a one per cent sample of all the claimants)
and to avoid the many possible sources of bias. The sampling procedure
was fully described by Professor Hilton [3] and, as F. F. Stephan
points out in a recent paper [4], it is odd that his methods were not
imitated by other Government departments. It was not, in fact, until
1937 that sampling was used in any large-scale Government investigation.
In that year, the Ministry of Labour undertook its enquiry into
working class expenditure [5] with the object of furnishing information
on which a revised Cost-of-living Index could be based. Budgets of
expenditure for four weeks (spaced at quarterly intervals) were obtained
from about 10,000 households, the initial selection of households
having been made by sampling at random from various registers.
The planned revision of the Cost-of-living Index was interrupted by 
the war, but the new Interim Index of Retail Prices, which has taken 
the place of the old Index, is based on the results of the 1937/8 sur¬ 
vey [6]. 

The 'thirties also saw the beginning of Listener Research and other
Opinion Research bodies, but it was the war which, as in America,
gave the decisive impetus to the utilization of sampling techniques.
The need for quick and cheap information led to the foundation of an
official Social Survey Unit and to other projects which will be described
in this article.



THE NATIONAL LISTS 

Perhaps the most important single difference between British and 
American sampling practice lies in the existence, in this country, of 
several lists covering the population; certainly, this is the key to what 
may appear to be simply lack of enterprise: our failure, so far, to make
more use of modern sampling developments, and particularly area
sampling. It will be seen that nearly all the major surveys to be de¬ 
scribed are based on samples selected from one or other of the lists. 
For that reason, and because the coverage and accuracy of the lists 
differ, a short note on each of them will be useful at this point. 

(1) The Maintenance Register 

This national register started in September 1939, when National Regis¬ 
tration came into force. It is kept at local Food Offices and covers the whole 
population, with the exception of the Armed Forces and Seamen. The 
Register is a live one, in so far as cards of persons, who have moved out of 
or into a district are refiled fairly rapidly in the new district. There are
separate files for persons under 16 and 16 and over, the cards being filed 
in order of code number (depending on the district where the card was 
initially issued). 

Apart from the code-numbers, the card gives details of the person’s
name, sex, address and age (last birthday and date of birth). It must be 
said that not all the cards have been filled in with all these details and 
there is a certain amount of inaccuracy; but efforts are being made to 
bring the Register up to a higher standard of accuracy and completeness. 

It will be seen that it is simple to draw a random sample of under 16's or
16's and over, or any other large age and sex group from this register.

(2) The Ministry of Food Files 

The civilian Ration Books in this country last for a period of one year 
and there is an exchange of new for old books every July. At this time, the 
reference pages are extracted from the old books and are filed in alpha¬ 
betical order of surnames at the local Food Offices. Unlike the Maintenance 
Register, the Ministry of Food file is only brought up to date once a year so 
that at any time during the year, the Register in any particular area will 
include persons who have died or have moved out of that area since last 
July. Dummy pages are inserted for persons moving into the area. The 
Register is divided into three separate groups: under 5, 5-18, over 18.

(3) The Rating List¹

The Rating ledger, kept by the Rating Officers of local authorities, is a 
list of all the rateable units in the area. It is generally in order of wards 


1 “Rates” correspond to U. S. local taxes.



and in alphabetical street order within wards; within streets, the order is 
simply by street-number. The entry for each dwelling unit (and, for rating 
purposes, flats are separate units) gives a description of the property 
(house, flat, cinema, shop, etc.) and the names of both the owner and 
occupier. As these lists are used for rating purposes, they are fairly accu¬ 
rate and up-to-date. By eliminating those types of property not required, 
it is clearly possible to form a sample of dwelling units. In order to sample 
successfully for households from this list, allowance must be made for the 
existence of multiple households and the difficulties arising from the dis¬ 
tinction between households and dwelling-units. 

(4) The Electoral Rolls 

This is a list, published annually, of all persons entitled to vote in elec¬ 
tions; that is, broadly speaking, British subjects aged 21 and over. It is 
arranged in the same order as the Rating List but, of course, gives not
only the occupier of, say, a flat, but all persons in the flat who are eligible 
to vote. The pre-war list of 1939 was especially useful, in that the head of 
the household and his wife were indicated by separate symbols. On the 
new register, this distinction is not made. This register is not very accu¬ 
rate and, if employed at all for sampling purposes, it should be used in con¬ 
junction with and checked against the Rating List. 

BRITISH SAMPLE SURVEYS 

A discussion of sampling surveys falls conveniently into three parts.
In the first place, a fairly full account will be given of official sampling
investigations; then, some of the more important surveys undertaken 
by semi-public bodies and research organizations will be mentioned. 
Finally, an indication will be given of the work of Market Research 
and Public Opinion bodies. 


A 

OFFICIAL SAMPLE SURVEYS

(1) The Social Survey 

The most important organisation in the Sampling Survey field is the
Government Social Survey Unit. Founded in 1941, to undertake surveys
urgently needed in the administration of Britain's war economy,
it is now (as part of the Central Office of Information) a well-established
Government research organization receiving an annual Treasury*
grant of £60,000. The last two or three years have seen not only a considerable
expansion in its size and its output (the Social Survey now
employs some 250 field workers distributed throughout the country
and a research and administrative staff of about 80) but also an improvement
in the quality of its work, including its sampling techniques.

* The functions of H. M. Treasury are considerably wider than those of the U. S. Treasury Dept.

The position of the Survey unit in Government administration is best 
explained by indicating the procedure governing its work. A survey is
planned in response to a request from a Government department, but 
is put into the field only after Treasury approval has been given. When 
the survey is finished, a report stating its methods and results is writ¬ 
ten by the Research Officer in charge, and is submitted to the depart¬ 
ment concerned for interpretation. While this procedure has the ad¬ 
vantage that the Survey undertakes surveys covering a wide variety 
of subjects, it appears to involve certain drawbacks which are worthy 
of mention. These are essentially drawbacks arising out of the position, 
rather than the work, of the Survey. 

a. The work of the Survey is largely confined to ad hoc investigations, the 
results of which are urgently required by Government departments. 
Consequently, not enough time is left for surveys which may be of
interest more from a sociological than from a direct administrative point 
of view. More especially, as it is difficult to experiment with methods 
in a survey the results of which have been requested and are to be used 
by a department, there is not as much methodological research as ap¬ 
pears desirable. It is to be hoped that the Social Survey will be able to 
spend more and more of its time on research into sampling techniques, 
interviewing methods, questionnaire biasses and all the other problems 
associated with social surveys. 

b. The position regarding the publication of the results of the Social 
Survey is highly unsatisfactory. The reports are, of course, sent to the 
relevant department and many of them are made available on request 
at the offices of the Survey. The Survey of Sickness results are published 
by the Registrar-General and, every now and then, the results of a 
survey find their way into the Press. It is essential that the reports be 
given a wider circulation. They might, for instance, be offered for sale 
at H.M.S.O. 

c. Thirdly, it is difficult to escape the impression that the conducting of 
the survey and the interpreting of its results are too widely separated. 
The Social Survey ought to be more than a mere collecting agency. The 
Government department usually keeps well in touch during the plan¬ 
ning and execution of a survey but there should be more consultation 
between the department and the Research Officer in charge of the par¬ 
ticular survey over the interpretation of the results and any actions 
arising directly from them. 

It is, of course, impossible in this article to refer to many of the 150 
or so sample surveys which the Social Survey has undertaken but it 
may be useful to name some of the more interesting investigations con¬ 
ducted for different departments and then to say something about the 
sampling techniques employed. The following list gives an idea of the 
scope of the Survey's work:— 




For the Board of Trade 

Numerous surveys on wartime shortages of consumer goods; on effects of 
and attitudes to clothes rationing; on the use of demobilisation coupons; 
on public knowledge of the need for an active export policy, etc. 

For the Ministry of Food 

Numerous surveys on different aspects of rationing; on the attitude to 
National wheatmeal bread and to National Milk and Cocoa, etc. 

For the Ministry of Information 

Surveys on the public attitude to various films, books, posters and other 
publicity media. 

For the Ministry of Labour

Surveys connected with the recruitment of women for industry; of miners 
and of agricultural workers. 

For the Ministry of Health

The Survey of Sickness; surveys on the public reaction to the publicity 
campaign on V.D. and diphtheria immunisation; and on the public attitude 
to the nursing profession. 

For the Ministry of War Transport

Surveys on workers’ transport difficulties and on road safety. 

In addition there have been regional surveys in Middlesbrough and
Willesden; a survey on the impact of air raids; a large survey designed 
to give a picture of the distribution of the population and of the family 
and household composition throughout the country, surveys on the 
demand for holidays, shopping hours, the incidence of deafness, the 
employment of old persons and many more. 

It is evident from this selection that the surveys undertaken by the
Social Survey not only cover a very wide field, but also that they
concern a large variety of populations, so that sampling methods vary 
a great deal. According to Box and Thomas [7] 

“The types of population sampled in Wartime Social Survey inquiries may
be classified roughly into three groups. In many inquiries, information is
required about the whole adult civilian population of Great Britain. A
rather larger group of problems concerns only housewives.... A third
type of problem relates to particular groups.”*

A fairly typical Social Survey sampling scheme is that used in its

* It would now be more accurate to say that most of the surveys concern the adult civilian population.




Survey of Sickness. This regular monthly survey, which began in 
1945, is an attempt to derive information about the incidence of all 
kinds of illness in the adult population. Each monthly sample of about 
3000 adults (over 16) is obtained as follows. The correct proportion
of interviews is allocated (according to the current population estimate 
of the Registrar-General) to each of the twelve Civil Defence Regions. 
Within each Civil Defence Region, the administrative districts are 
divided into three groups:— 

(a) Towns large enough (over 300,000) to be entitled to 30 or more
interviews. (The Social Survey tries to arrange its samples so that
about 30 interviews are allocated to each interviewer. This is found to
be necessary from the point of view of cost and interviewer efficiency.)

(b) Other Towns. 

(c) Rural Districts. 

In Group (a), no further division is made. The Registers of all the towns 
are sampled (such towns account for about 10 per cent of the national 
sample). In Group (b), the towns are divided into groups by population 
size in such a manner that each size group is entitled to about 30 
(or multiples of 30) interviews. The rural entitlement of interviews 
for the region is divided into units of approximately 30 interviews each. 
In both group (b) and (c), the requisite number of towns and of rural 
districts is chosen, as far as possible, at random, consideration being 
given to ensure that each of a number of geographical sub-regions 
receives approximately the correct quota of interviews. In all cases, 
the individuals for interview are taken at regular intervals from the 
Maintenance Register in the particular towns and rural areas selected. 
The over-all sampling fraction is about 1 in 10,000. 
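
The arithmetic of such a scheme may be sketched in Python. The sketch is only an illustration under stated assumptions: the regional populations are hypothetical, the largest-remainder rounding is one convenient way of making whole-number shares add to the total, and the random start for the regular-interval selection is assumed, since the text says only that individuals are taken at regular intervals from the Register.

```python
import random

SAMPLING_INTERVAL = 10_000   # over-all fraction of "about 1 in 10,000", from the text


def allocate(total_interviews, populations):
    """Share interviews among regions in proportion to population.

    Largest remainders are used so that the whole-number shares add to the total.
    """
    grand_total = sum(populations.values())
    exact = {r: total_interviews * p / grand_total for r, p in populations.items()}
    shares = {r: int(x) for r, x in exact.items()}
    leftover = total_interviews - sum(shares.values())
    by_remainder = sorted(exact, key=lambda r: exact[r] - shares[r], reverse=True)
    for region in by_remainder[:leftover]:
        shares[region] += 1
    return shares


def town_group(population, rural=False):
    """Classify a district: (a) large towns, (b) other towns, (c) rural districts."""
    if rural:
        return "c"
    return "a" if population / SAMPLING_INTERVAL >= 30 else "b"


def systematic_sample(register, n, rng):
    """Take n cards at regular intervals from a register, after a random start."""
    interval = len(register) / n
    start = rng.random() * interval
    last = len(register) - 1
    return [register[min(int(start + i * interval), last)] for i in range(n)]


# Hypothetical adult populations for three regions (not figures from the survey).
regions = {"North": 4_100_000, "Midlands": 3_300_000, "South": 2_600_000}
shares = allocate(1000, regions)
cards = systematic_sample(list(range(5000)), 30, random.Random(5))
print(shares, town_group(450_000), cards[:3])
```

With a fraction of 1 in 10,000, a town of 300,000 is entitled to just 30 interviews, which is why that population marks the boundary of group (a).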

In other surveys, such as the investigation of water-heating appli¬ 
ances in domestic dwellings or that of crockery stocks, interest is in 
households rather than individuals. In these surveys, the samples were 
selected from the Electoral Rolls and the Rating List respectively. In 
yet other surveys, such as those on Road Safety, the Demand for Holi¬ 
days and that on Shopping Hours, quota sampling was used. 

It will be seen that so much variation in sampling practice exists 
within the Social Survey that it is difficult to give a complete picture. 
The following general points may, however, be made:— 

a. In the surveys concerned with the adult population of Great Britain,
the samples are usually between 1500 and 3000 cases.

b. For most of the surveys, one or other of the lists mentioned above forms
the basis of the sampling. The Maintenance Register and the Rating




List are the most frequently used, for the sampling of individuals and 
households respectively. In surveys connected with special populations, 
recourse is made to other records. So, for instance, in the Survey of 
Pneumokoniosis, the sample of 900 was taken from Ministry of Fuel 
records of men certified by the Silicosis Boards; in the survey on Blood
Transfusion, the sample of 1200 donors was taken from local records 
of the Blood Transfusion Service. 

c. The weak link in sampling schemes such as the one used in the Survey 
of Sickness lies in the selection of the actual towns and rural districts; 
as indicated above, this selection is largely governed by the desire to 
achieve complete geographical coverage of the area. But it is limited by 
considerations of cost and interviewer economy and efficiency. The 
samples are small, in terms of absolute numbers, and the attempt to 
allocate the interviews in bunches of at least 30 necessitates principles 
of selection which are not altogether satisfactory. It is believed that
the Social Survey is giving thought to this problem. 

d. It is satisfactory to note that the Social Survey is moving further and 
further away from the use of quota sampling which, at one time, was its
chief method of sampling. It may be hoped that, ultimately, the method
may be abandoned altogether. 

The discussion of the work of the Social Survey has, inevitably, been 
short and incomplete. There is no doubt that the Survey has done very 
useful work and also, as its officers themselves would probably agree, 
that there is plenty of room for improvement in its methods. The time 
is ripe for a complete and up-to-date statement of the Survey's work 
on the lines of the paper presented in 1944 by Box and Thomas and, 
perhaps even more urgently, for a full and critical examination of all 
the techniques employed in its surveys.

(2) The Ministry of Food 

The Ministry of Food was one of the first Government Departments
to make any considerable use of sampling surveys. In addition to surveys
undertaken for the Ministry by commercial organisations, the Social
Survey has, during the last few years, carried out a large number of 
ad hoc investigations on different aspects of the country's food situation. 
The most important of the Ministry's various survey projects, however, 
is its continuous Family Food Survey, the aims and methods of which 
will now be outlined. 

Object. The Family Food Survey is the chief source of information
concerning the dietary habits of the population. Started in 1941, its
objects are (a) to investigate “the nutritional value of food actually
consumed as compared with the estimated physiological needs of the
same families”; and (b) to collect data on the uptake of welfare foods



SAMPLING IN GEEAT BEITAIN 


239 


and on the ejffect of food control measures, including price changes. 
A household or family is defined as “all persons for whom the house¬ 
wife caters”. 

Population. From the beginning of the survey the main sample 
has concentrated on households representative of the working class 
population of the country. In addition, similar middle class and rural 
working class samples have been taken more or less continuously for 
comparative purposes in most years. Special groups, such as miners, 
agricultural workers and old age pensioners are investigated from time 
to time. 

The Sample. The Ministry take a fresh sample of approximately 
800 households every month, so that about 10,000 households are 
covered annually. The sampling is done in two stages. 

(1) Towns, including all the great conurbations, are first chosen
by purposive selection, i.e. with regard to their size and the character
of the region.

(2) Within the selected towns, sampling is done at random from the
Electoral Register. In practice, the procedure is to tackle one part of
the town, or ward, at a time so that interviewing is done as economically
as possible. Predominantly upper class wards are excluded from
the sample, as are those consisting mainly of industrial premises. Any
middle-class households sampled in mixed wards are added into a
separate middle-class sample. When lists for a whole ward are exhausted
(the sampling fraction is 1 in 35), a new ward is started on. This continues
until the whole town is covered; then a new and, if possible,
similar town is chosen for sampling.

The Ministry of Food points out that the representativeness of the 
resultant sample is subject to two qualifications: 

a. In the first place, small towns tend to be slightly under-represented, 
mainly as a result of the inevitable immobility of the interviewers. 

b. In the second place, in the past about 20 per cent of the households sampled
proved to be non-contacts even after three calls, while a further
30 per cent could not or would not co-operate for different reasons. 
When all else fails, interviewers are permitted to take substitutes by 
calling at each house in turn to the right of the one originally sampled. 

This large-scale substitution is a potential source of bias and this 
must be borne in mind in interpreting the results. The fact that the 
sample has had a somewhat overweighted proportion of children may 
be attributable to this substitution. As far as average food expenditure 
per head is concerned, analysis has shown little difference between
pre-selected and substituted households. 





AMERICAN STATISTICAL ASSOCIATION JOURNAL, JUNE 1949 


The fieldwork of the survey is undertaken by an outside commercial 
agency, so that direct contact between the Ministry and informant is 
avoided. Fieldworkers call every second day during the week of investigation
or, if necessary, every day.

Sampling Variation. Another point to consider is the variability of 
the population in terms of family size, sex and age composition and the 
number of meals taken at home and outside. For this reason, as the 
Ministry points out, repeat samples taken at the same time and with
the same regional distribution will vary slightly. Percentage coefficients 
of variation have been calculated, for all the different food expenditures, 
nutrient intakes, etc. 

Procedure and Data Obtained. Each housewife who agrees to co-operate
is given a log book to complete for the succeeding seven days.
For each day, she has to fill in four sections as follows:

(1) Quantity, description and cost of each item of food bought on that 
day. 

(2) Quantity, description, source of and price paid (if any) for all home-grown
food, gifts and welfare food.

(3) Description of food served at each meal and note indicating which persons
were present.

(4) Number and type of meals taken out by different members of the 
household. 

In addition, the interviewer records on the cover page details of family
composition, including age, sex, relationship to housewife and occupation
of all members of the household. This information makes the
calculation of the nutritional needs of the family possible. Furthermore, 
food in the larder is weighed and measured at the beginning and end 
of the week, so that changes in stocks may be calculated. 

Results. The results, calculated for the sample every month, show: 

a. The average daily consumption per head of each food.

b. The average daily intake per head, and average daily requirement per
head of each nutrient. 

c. The average expenditure on and actual price paid for each food, and on 
all foods. Also the quantity and value of foods obtained from free 
sources. 

(3) The Ministry of Labour 

The Ministry of Labour has made less use of sampling than might 
perhaps have been expected. Reference has already been made to its
1937/8 Family Budget enquiry and to Professor Hilton’s investigation. 
The other use of sampling which should be mentioned here is in the
Ministry’s Analysis by Age-group of Insured Workers. Up to July
1948, at the time of the annual exchange of unemployment books,
information about the distribution of insured workers by industry, sex
and by four age-groups (under 16, 16 and 17, 18-20, 21 and over) was
obtained. It was desired to obtain more detailed information for the
21 and over age-group and this was done first in 1937 and regularly
since 1942 by means of sampling.

The sample was taken from bundles of about 100 unemployment
books tied up at the Employment Exchanges (for dispatch to another
office). Two books were selected out of every bundle, respectively
I and f of the way through the bundle. The resultant 2 per cent sample
(about 300,000) was a small one and the Ministry of Labour pointed 
out that great care needed to be taken in interpreting results. 

These samples yielded analyses for total insured workers by sex, age
and industry and by five-year age-groups. Up to 1947, the results are
available only on a national basis; in that year, for the first time, a
regional analysis was also published.

(4) The Ministry of National Insurance 

The analysis based on the annual exchange of all unemployment 
books (not the additional sample analysis) not only yielded detailed
information about the insured working population, but also served as
a basis for estimates of the total working population.

With the new National Insurance Act, which came into force on
5 July, 1948, a different type of insurance book is used and their 
large number (25,000,000) renders a simultaneous exchange of books 
impracticable. 

The exchange of books is consequently being spread over the year 
at quarterly intervals and a careful sampling plan has had to be devised 
in order that detailed estimates for the whole working population may 
be made. 

The books have been allocated to the four quarters in a systematic 
and unbiassed manner. No details have so far been announced about 
the kind of estimates it is intended to make at the quarterly dates, 
or the way in which they will be combined to give annual estimates. 

(5) The Ministry of Works

The Ministry of Works uses two methods for obtaining details of
the Labour Force in the Building Industry. In the first place, a quarterly
census of the whole industry is taken yielding very detailed data on
the distribution of the labour force by region, occupation, type of
work, etc. Owing to the large number of forms involved and the delay
in getting them in, the first results are not available until 12 weeks
after the Census date.

The other source of information is a monthly sample (started in 
1945), for which the population consists only of the twelve main building 
trades (about f of the whole industry). Information collected from the 
sample is confined to the total number of operatives employed by firms.
Against these disadvantages of more limited coverage and less detailed
information of the sample enquiry must be set the great economy for
both the Ministry and the industry, increase in accuracy and the fact
that the final tabulation is available three weeks after the sample 
date. 

The Ministry of Works requires that the maximum error in the
over-all total of operatives should be 1 per cent and found that the 
appropriate sample size would be between 5000 and 6000 contractors. 
The sample is based on the last available census tabulations (usually 
from a census taken 5-7 months previously). The sampling fraction 
varies from 1 in 100 for firms with 0 employees, 1 in 30 for firms with 
1-5 employees to 1 in 2 for firms with 71-99 employees and a 100 per
cent sample for all firms with 100 and more employees. The over-all 
sampling fraction is approximately 1 in 20, there being about 120,000 
firms in the 12 Trades with which this sample is concerned. 

The sampling in these investigations of the Ministry of Works is 
designed and executed with particular care and the possibility of taking 
a census less frequently and deriving more detailed information from
the samples is being investigated. 

(6) The General Register Office 

The General Register Office is the department responsible for the 
Decennial Censuses of Population and for the collection and publication 
of demographic statistics generally. In view of the scope of its work, 
its failure so far to make use of sampling to any considerable extent is 
striking. There are two uses of sampling by the G.R.O. to which reference
may be made:

(1) “Classification and tabulation of multiple or secondary causes
of death” [8]. In the classification of death by cause, when more than 
one cause is mentioned, it is necessary to select one as that to which 
the death should be classed. Information should, however, be collected 
not only on the frequency of occurrence of each cause as a primary or 
secondary cause but also on the frequency with which certain causes 
appear in conjunction. 





In each of the years 1921-1930, a group of causes of death was 
selected for further investigation. Apart from the ordinary punching 
of the primary cause of death on each card, the whole cause as certified 
was written at the top of the card. A sample was then taken for each 
group of deaths attributed to a particular primary cause; the sampling
fraction varying from 1 in 10 for groups with a large number of deaths
to a 100 per cent sample for groups with only small numbers. 

The secondary causes of death on all the sampled cards were then 
coded and an analysis of associated causes was made. 

(2) “Emergency Medical Service Records.” Records are available for
all patients who have been treated as in-patients under the E.M.S. A 1
in 6 sample of all the cards of discharged patients was taken, yielding
about 45,000 cases for each year from the beginning of the war to the
end of 1947. The data collected and tabulated from this survey included
sex, age, civil status, branch of service or other occupation, and
full details of the patient’s hospital record, from his admission to his 
discharge. 

It will be seen that both the above samples were samples from documents
and the only field sample survey with which the General Register
Office has, in fact, been in any way associated is the Survey of Sickness 
(mentioned in Section 1) the results of which are now published in the 
General Register Office Quarterly Returns. 

Some general explanations of the apparent lack of enterprise regarding
the use of sampling in Government departments will be offered in
a later Section. It is, however, worth noting at this point that the 
Registrar-General is considering the use of sampling at the 1951 
Census of Population but that no conclusions have so far been reached. 

There is an unanswerable case for the use of sampling at the next 
Census to get out preliminary results more quickly. Furthermore 
there is no reason why all the final Census tabulations should be based 
on an analysis of all the returns. Supposing that the same information
was obtained from every member of the population (thus avoiding any 
legal difficulties), results of sufficient accuracy might be obtained on a 
number of questions by taking samples of the returns. This would save 
considerable time, labour and money and would avoid the situation 
arising out of the 1931 Census, some of the results of which it has still 
not been possible to analyse. 

It is to be hoped that the Registrar-General will decide in favour of 
sampling and that no time will be lost in undertaking the necessary 
research and other preparations. 





(7) The Sample Family Census

The Sample Family Census, which took place at the beginning of 
1946, is an essential part of the work of the Royal Commission on 
Population. Its purpose was to obtain information about changes in 
the size of families and thus to provide data of obvious importance 
with regard to “the population problem and its bearing on housing,
family allowances, social insurance, and other measures of social 
welfare” [9]. 

The sample of 1 in 10 of all the married women in the country
(yielding about 1,600,000 women) was selected from the Food Office
files. At the 1945 exchange of ration books, women had been asked
to describe themselves as “Miss” or “Mrs.” Accordingly, after every
10th reference page had been extracted from the file of persons aged
18 and over, the pages of all males and of all females describing themselves
as “Miss” were discarded (apart from women who, though they
described themselves as “Miss,” had changed their names since the last
exchange of ration books). The remainder, then, were women who had
described themselves as “Mrs.” or who had not given the information.
In order to preserve the randomness of the sample, all these women were 
contacted and those who were actually found to be “Miss” were then 
discarded from the sample. One more precaution was necessary before 
this could be regarded as the final sample of presently or formerly 
married women: the pages for women who had left the particular 
area were removed by checking the reference pages against the 
Maintenance Register. Great efforts were made to contact all the women 
sampled and substitutes were not permitted in any circumstances. 

Questions were asked on: whether at present married, widowed
or divorced; date of birth and of first marriage (and of the end of the
marriage, if applicable); dates of birth of all live-born children; number
of children under 16 alive, and husband’s occupation.

The information was collected by 12,000 enumerators who received 
1/4d for each completed form. It is hoped to publish the report on the
Census, with full details of the sampling methods employed and accuracy
achieved, sometime next year.

(8) The National Farm Survey

The National Farm Survey was a development from the local farm
surveys which were being carried out during the war to assist the work
of the County War Agricultural Committees. The task of these Committees
“may be shortly stated as ensuring that each farm makes its
maximum contribution to food production” [10]. 





The National Farm Survey was carried out from 1941 to 1943 and 
consisted, in the main, of information obtained according to a standardised
pattern from the local surveys and of the returns from the 1941
agricultural Census. The survey, in addition to aiding the local committees,
provided invaluable statistical material on a national and
county basis. 

The Survey population consisted of the 300,000 holdings of 5 acres 
and over. Information was collected on the following subjects:— 

A. Type of tenure, rent and length of occupation. 

B. Economic type of occupier.

C. Size and type of holding. 

D. Convenience of farm lay-out; situation of holding; condition of 
buildings. 

E. Nature of the soil; nature of water and electricity supply. 

F. Managerial efficiency of occupier and condition of cultivated 
land. 

It was impracticable and unnecessary to base the national and county 
totals and averages on an analysis of the records of all the 300,000 
holdings. A random sample was therefore drawn and was stratified by 
county and size of holding. The sampling fractions varied according to 
size of holding as follows: 


Size of Holding      Sampling Fraction
(Acres)              (%)

  5- 24.9              5
 25- 99.9             10
100-299.9             25
300-699.9             50
700 and over         100


The final sample covered about 14 per cent of all the holdings in England
and Wales. The figures throughout the report were arrived at by multiplying
the sample results by the reciprocal of the respective sampling
fractions. The full published report includes a detailed description of
the sample design, the standard errors (which are trivial for national
and very small for county data) and estimates of the gain through
(a) stratification and (b) variable sampling fraction.
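
The raising of sample results by the reciprocal of the sampling fraction can be illustrated with a short Python sketch. The sampling fractions are those of the table above; the function name and the stratum sample counts in the example are invented purely for illustration and are not figures from the report.

```python
from fractions import Fraction

# Sampling fractions by size of holding (acres), as given in the table above.
SAMPLING_FRACTION = {
    "5-24.9": Fraction(5, 100),
    "25-99.9": Fraction(10, 100),
    "100-299.9": Fraction(25, 100),
    "300-699.9": Fraction(50, 100),
    "700 and over": Fraction(100, 100),
}


def raise_to_population(sample_totals):
    """Estimate population totals from per-stratum sample totals.

    Each sample total is multiplied by the reciprocal of its stratum's
    sampling fraction, and the raised figures are then summed.
    """
    raised = {
        stratum: total / SAMPLING_FRACTION[stratum]
        for stratum, total in sample_totals.items()
    }
    return raised, sum(raised.values())


# Hypothetical sample counts of holdings showing some attribute
# (illustrative numbers only, not taken from the Survey).
example_sample = {
    "5-24.9": 200,
    "25-99.9": 300,
    "100-299.9": 400,
    "300-699.9": 150,
    "700 and over": 40,
}
raised, estimate = raise_to_population(example_sample)
```

With these invented counts, the 200 small holdings in the sample stand for 4,000 in the population, whereas the 40 largest holdings stand only for themselves, since that stratum was sampled completely.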

(9) BBC Listener Research 

It is appropriate, at this stage, to say something about the work of
the Listener Research Department of The British Broadcasting Corporation.
This department, first set up in 1936, collects information
about the listening habits and tastes of the British public; it tries, in
fact, to establish some kind of a “box-office” substitute in the world
of radio. The head of the department has described the two principal
tasks of Listener Research as being:

(1) to find out how many people listen to each broadcast, 

(2) to ascertain listeners’ opinions of the broadcasts which they hear [11].

To accomplish these tasks, Listener Research organizes two separate 
and very different enquiries, which will now be described in a little 
detail. 

The Survey of Listening 

This continuous survey is based on a daily sample of 3000 persons 
(aged 16 and over). An equal number is interviewed in each of the six 
BBC broadcasting regions into which the country is divided, but the 
results are subsequently weighted according to the population in each 
region. A further sub-division into rural and urban areas is made within 
regions. Selection of individuals in each area is by means of quota 
sampling—according to quotas based on the sex, age and social class 
distribution of the population. The procedure employed for determining 
the social class of the informant is for the interviewer to ascertain his 
occupation and with the aid of a “common-sense assessment of the 
contact’s manner, bearing and conversation” to classify him as working,
lower middle or upper middle class. It may be questioned whether this
procedure is reliable and, considering the relevance of social class in 
this type of survey, one might suggest the use of a more precise method. 

In the interview, the contact is asked to name the programmes he 
listened to on the previous day. The interviewer will try to aid the 
informant in remembering the broadcasts and may have to distinguish 
between “listening to” and simply “hearing” a programme. This, of 
course, is the great snag of all radio research and it is doubtful whether 
there is any satisfactory solution. 

The results of these daily interviews are issued about 7 days after 
the broadcast in the form of what is called the Listening Barometer; 
that is to say, against each programme are shown the percentages of 
the sample (in each region and in Great Britain as a whole) who listened 
to it. 
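
Because an equal number is interviewed in each region, the figure for Great Britain as a whole has to be a population-weighted combination of the regional percentages rather than a simple pooled percentage. A minimal Python sketch of that weighting follows. The region labels, regional populations and listener counts are hypothetical; only the equal allocation of 500 interviews to each of six regions (3000 in all) follows from the text.

```python
def weighted_listening_percentage(listeners, interviewed, population):
    """Combine regional listening proportions into a national percentage.

    listeners[r]   -- number in region r's sample who listened
    interviewed[r] -- number interviewed in region r
    population[r]  -- population of region r, used as the weight
    """
    total_population = sum(population.values())
    national = 0.0
    for region, pop in population.items():
        regional_share = listeners[region] / interviewed[region]
        national += regional_share * pop / total_population
    return 100.0 * national


# Hypothetical figures for six regions (populations in millions).
population = {"A": 10, "B": 8, "C": 12, "D": 5, "E": 3, "F": 2}
interviewed = {region: 500 for region in population}  # 3000 / 6 per region
listeners = {"A": 100, "B": 50, "C": 150, "D": 25, "E": 200, "F": 250}

weighted = weighted_listening_percentage(listeners, interviewed, population)
unweighted = 100.0 * sum(listeners.values()) / sum(interviewed.values())
```

In this invented example the two small regions have the heaviest listening, so the unweighted pooled percentage overstates the national audience relative to the weighted one.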

The Listening Panel 

While it is of great value to the BBC to know the number of people 
who listened to each broadcast, it is equally important to know how
a particular broadcast was liked by those who did actually listen to it.
For this purpose, the BBC has established a panel made up of volunteers 
recruited as a result of occasional appeals. This panel consists of 3600 
members, distributed equally over the six regions. Each panel member 
receives, twice a week, a questionnaire containing questions about a 
number of broadcasts to be made during the next few days. All types 
of programmes are included and panel members are specifically asked 
to give their opinions only on those broadcasts in which they were 
anyhow interested; that is to say, duty listening is to be avoided. 
The panel members (whose average “panel life” is 18-24 months) are, 
of course, self-selected and it is questionable whether the opinions sent 
in on a particular broadcast can be regarded as fully representative of 
the opinions of all who listened to it. The summarised results of the 
questionnaires are passed on to the producers of the programmes. 

The results of Listener Research are given the widest possible circulation
inside the BBC, but it is impossible to assess the extent to which
they influence programme policy and planning. As Listener Research 
constitutes the main link between the BBC and its public, directors 
of sections and programme producers probably take a considerable 
interest in its results. 

The above account of the use of sampling in official surveys cannot 
be regarded as complete. It has not been possible, for instance, to say 
anything about the use of sampling in the Ministry of Home Security's 
wartime investigations of the effects of bombing; or of the sample 
enquiry which the Royal Commission on the Press is believed to be 
making; or for that matter, of the employment of sampling methods 
in the internal work of Government departments. It is believed, however,
that enough has been said about the most important official sample
surveys, for which any details could be obtained, to form a basis for 
the general remarks which follow in the last sections of this paper. 

B 

SURVEYS CONDUCTED BY SEMI-PUBLIC BODIES
AND RESEARCH INSTITUTES 

An indication has already been given of the wide use of sampling 
in social surveys and there is an increasing tendency, on the part of 
Universities, Research Institutes and local authorities, to avail themselves
of sampling techniques in their surveys of social and economic
problems. It is now proposed to describe a few of the more important 
and interesting surveys which have recently been, or are still being, 
carried out in this field by non-official, non-commercial organizations. 




1. Population Investigation Committee—Survey of Maternity 

The survey on maternity [12] was undertaken in order to obtain 
data on the social and economic aspects of childbearing. In particular, 
information was sought on the availability and use made of maternity 
services in different parts of the country and social classes, and on 
present-day expenditure on childbirth. 

It was desired to collect the information through personal interviews 
with mothers “whose experience could be regarded as typical of the 
whole population of women now bearing children.” The method
adopted was to take a sample in time from all local authorities, rather
than a sample of local authorities, and to interview all women who
were delivered in England, Wales and Scotland during the week of 
3-9 March, 1946. This procedure had the advantage of greatly 
easing and speeding up both the field work and the administration of 
the survey. 

Officials of all the 458 local authorities were approached and active 
support was received from 424 (92%) of them. It is estimated that 
16,696 births were registered during the week in question, of which 
15,130 were notified to the Survey Committee. Successful interviews 
were made regarding 13,687 (or 90.5%) of these. It is interesting to 
note that refusals were obtained in only 2% of the cases. 

As a great deal of information had to be asked for, it was decided 
to issue two separate types of questionnaires. Type ‘A’ dealt mainly
with the use made of maternity services, while Type ‘B’ concentrated
on the financial aspects. Certain basic questions were, of course, asked
in both. The 424 authorities were split at random into two groups,
one group for the Type ‘A’ and the other for the Type ‘B’ questionnaire.
The proportion of successful interviews was 91.5% for Type ‘A’
and 89.3% for Type ‘B’ questionnaire. Analysis shows that, in most
important respects, the samples of mothers making up the two surveys 
are closely comparable. 

The results of the questionnaire inquiry (which, together with
the methods, are very fully described in the report) are supplemented
by detailed studies of the extent and quality of maternity services in 
selected areas. 

One aspect of particular interest remains to be noted. It was decided 
to conduct a follow-up survey in order to study 

(a) differences in morbidity of different groups of mothers and children; 

(b) the factors which influence the health and development of different 
groups of children during their first two years of life; 

(c) differences in morbidity and development of full-time and premature 
children. 




It was felt to be unnecessary to apply the questionnaire to all the 
mothers in the original enquiry, as long as the absolute number in each 
social group was large enough to yield the precision required. For (a) 
and (b), the sample consisted of one out of four mothers and children 
in the “manual workers” group (chosen at random) and all those in 
the other groups. This gave a total of about 5000 cases. 

This follow-up survey took place in March 1948 and met with considerable
success. Special efforts were made to trace those women who
had left their original address and completed questionnaires have been 
received from about 94% of the women. 

For the survey concerning premature children, a different method of
selection was employed. Each of the 800 or so premature babies in the
original sample was matched with a full-time one of the same sex,
social class and birth order and born to a mother of the same age,
living in the same region and at the same room density. A number of 
the babies subsequently died while others moved out of the original 
area, so that a good number of the pairs had to be abandoned. It has, 
however, proved possible to follow-up 640 pairs through their first two 
years of life. This unusually large-scale and carefully conducted controlled
investigation will, in fact, provide invaluable material on the
causes of prematurity; on possible ways of preventing deaths arising 
from it; and on the question as to the extent to which premature and 
full-time babies differ in their subsequent development. 
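
The matching of premature with full-time babies described above can be sketched as follows. This is only an illustration of exact matching on the six characteristics named in the text, not the Committee's actual procedure: the field names, the rule that each full-time baby serves as a control once only, and the example records are all assumptions.

```python
from collections import defaultdict

# Hypothetical field names for the six matching characteristics in the text.
MATCH_KEYS = ("sex", "social_class", "birth_order",
              "mother_age", "region", "room_density")


def match_pairs(premature, full_time):
    """Pair each premature baby with one unused full-time baby that has
    identical values on every matching characteristic."""
    pool = defaultdict(list)
    for baby in full_time:
        pool[tuple(baby[k] for k in MATCH_KEYS)].append(baby)

    pairs, unmatched = [], []
    for baby in premature:
        candidates = pool[tuple(baby[k] for k in MATCH_KEYS)]
        if candidates:
            pairs.append((baby, candidates.pop(0)))  # each control used once
        else:
            unmatched.append(baby)
    return pairs, unmatched


def _baby(ident, sex, birth_order, mother_age):
    """Build an illustrative record; other characteristics are held fixed."""
    return {"id": ident, "sex": sex, "social_class": "manual",
            "birth_order": birth_order, "mother_age": mother_age,
            "region": "North", "room_density": 1.5}


premature = [_baby("P1", "F", 2, 27), _baby("P2", "M", 1, 31)]
full_time = [_baby("F1", "M", 3, 31), _baby("F2", "F", 2, 27)]
pairs, unmatched = match_pairs(premature, full_time)
# P1 pairs with F2; P2 has no exact match because F1 differs in birth order.
```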

2. Population Investigation Committee—Inquiries into the Trend of 
Intelligence 

In 1947, the Population Investigation Committee and the Scottish 
Council for Research in Education jointly initiated an inquiry into 
changes in national intelligence in Scotland. The purpose of the 
inquiry is to test the hypothesis that the “current patterns of differential
fertility (which show a negative correlation between size of family
and the measured intelligence of the children) really imply a fall in 
national intelligence.” 

The present inquiry has been conducted in Scotland, because in 
1932 a complete age-group of Scottish children had been the subject 
of an inquiry into intelligence and it seemed appropriate to apply the 
same tests, which had then been used, under similar conditions to the 
same age-group of children. The survey has covered some 80,000 children
born in 1936, in respect of whom the same group intelligence test has
been administered and for whom an individual questionnaire was 
completed, giving details of age, sex, size of and position in family, 
school and class, etc. 




A special and more detailed questionnaire has been filled in by a 
random sample of children, consisting of all children born on the first
three days of each month in 1936 (“36-day sample”). Furthermore,
individual Binet tests have been administered to a further sub-sample
consisting of all children born on the first day of each alternate month
(“6-day sample”). 
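
Selection by date of birth is simple to express in code. The sketch below, in Python, tests membership of the two samples. The text does not say which six alternate months were used for the 6-day sample; starting with January is an assumption made here for illustration only.

```python
from datetime import date, timedelta


def in_36_day_sample(born):
    """First three days of each month of 1936 (3 days x 12 months)."""
    return born.year == 1936 and born.day <= 3


def in_6_day_sample(born, first_month=1):
    """First day of each alternate month of 1936 (6 days in all).

    first_month=1 (January, March, ...) is an assumption; the source
    does not state which alternate months were taken.
    """
    return (
        born.year == 1936
        and born.day == 1
        and (born.month - first_month) % 2 == 0
    )


# Every calendar day of 1936, a leap year of 366 days.
days_1936 = [date(1936, 1, 1) + timedelta(days=i) for i in range(366)]
n_36 = sum(in_36_day_sample(d) for d in days_1936)
n_6 = sum(in_6_day_sample(d) for d in days_1936)
```

Every birthday qualifying for the 6-day sample also qualifies for the 36-day sample, so the former is a true sub-sample of the latter.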

The field work has been highly successful. The group tests were administered
to 88% of the children and a greater number completed
the questionnaire; the more detailed questionnaire was obtained from
99% of the 36-day sample, while the Binet tests were given to 99% of 
the 6-day sample. 

Results, which are not yet available, will show the change in the
average and standard deviation of the I.Q. since 1932; and the relationship
between I.Q. and size of family, parental occupation and other
factors. It is intended to follow up a sample of the children, in order to 
compare the development of children of different I.Q.’s over a number 
of years. For this purpose, the 6-day sample will be taken plus 400 
children of very high I.Q. and 400 children of very low I.Q.

3. London School of Economics Social Research Division—Survey on Social
Class and Social Mobility

The aim of this survey, financed by the Nuffield Trust, is to discover
what people mean by social class, and what are the chief factors accounting
for class differences and for the movement from one class to
another. An attempt will be made to estimate the relative importance
of these factors and to trace the changes which have taken place in
their influence over past years. 

A start has been made with a factual survey carried out in two London 
Boroughs and a neighbouring rural area, designed to test a method of 
obtaining information about educational opportunity and occupations
entered in successive generations. The possibility of extending this 
investigation to other parts of the country is under consideration. 

The aim in this survey was to interview a married person, preferably 
the wife, and the introduction to the married person was obtained by 
drawing a random sample of individuals from all those of age 16 and 
over resident in the area investigated. If the individual drawn was 
unmarried, but living with parents, the mother was to be interviewed. 
If the individual drawn was of age 60 or over living with married son
or daughter, the young wife was to be interviewed. Persons not to be
interviewed were: 





(a) Unmarried persons living apart from their parents.

(b) Persons previously married (to avoid complications possibly arising
from differential treatment of step-children).

(c) Separated or widowed persons.

It is expected that this research will extend over approximately five 
years. 

4. Local Social Surveys 

The early social surveys, to some of which reference has already been 
made, were primarily focused on the central problem of poverty and 
its different aspects. In the late ’30’s, the emphasis began to shift. 
Attention was being directed more and more to questions connected 
with town planning and life on housing estates as, for instance, in the 
surveys of Birmingham [13], Becontree [14] and Watling [15]. Since 
then, and especially since the end of the war, a large number of regional 
planning surveys have been initiated and published. The social survey 
has rightly become an integral part of town planning and most of the 
recent planning studies will be found to contain a social and economic
survey, usually based on a sample of dwellings or households. Thus, for
instance, the Middlesbrough Plan [16] includes a sample survey as do
the plans of Luton [17], Worcester [18] and some of the studies made by
the Association for Planning and Regional Reconstruction. The sampling
methods employed are generally not of great intrinsic interest and
it is proposed here to describe, as an example, only the survey of Luton,
which is one of the best published so far. 

Report on Luton (1945) 

Luton, described as a “young, expanding industrial town” surrounded 
by rural areas, lies 20 miles from the northern fringe of 
London and has approximately 100,000 inhabitants. It was felt that, 
in view of current housing problems and the present-day changes in 
education, health and other services, the Local Council should have 
up-to-date information on population, housing and the social aspects 
of public services. 

As the authors were primarily interested in population and housing, 
they would have liked to survey every occupied dwelling. This, however, 
was impossible and a sample of dwellings was taken. Luton was divided 
into three areas. District I: areas of unfit houses; District II: the 
older parts of Luton; District III: the rest of the Borough. The 
population density of these districts is shown by the average number of 





AMERICAN STATISTICAL ASSOCIATION JOURNAL, JUNE 1949 


persons per acre, which were 92, 36 and 11 respectively. The sampling 
fraction varied as follows: 


District   No. of Houses   Sampling Fraction   No. of Houses Surveyed

I                    890        1 in 1                      890
II                 5,270        1 in 5                    1,054
III               22,300        1 in 10                   2,230

Total             28,450                                  4,174


To obtain data for the whole Borough, a sample of 1 in 10 of all 
dwellings in the borough was achieved by using 1 in 10 of the houses 
sampled in District I, every other one in District II and every one in 
District III. The actual addresses in each district were selected from 
the Rating List. 
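
The arithmetic behind this uniform 1 in 10 sample can be checked with a short sketch. The house counts are those printed in the table; the first-stage fractions are simply the ratios of houses surveyed to houses in each district, and the names `districts`, `subsample_rate`, `overall_fraction` and `borough_sample_size` are illustrative assumptions, not terms used in the Luton report.

```python
from fractions import Fraction

# District figures as printed in the table; the first-stage sampling
# fraction is taken to be houses surveyed / houses in the district.
districts = {
    "I": {"houses": 890, "surveyed": 890},
    "II": {"houses": 5270, "surveyed": 1054},
    "III": {"houses": 22300, "surveyed": 2230},
}

# Subsampling rates quoted in the text for the whole-Borough sample.
subsample_rate = {
    "I": Fraction(1, 10),   # 1 in 10 of the houses sampled
    "II": Fraction(1, 2),   # every other one
    "III": Fraction(1, 1),  # every one
}


def overall_fraction(name):
    """Effective sampling fraction of a district in the Borough sample."""
    d = districts[name]
    first_stage = Fraction(d["surveyed"], d["houses"])
    return first_stage * subsample_rate[name]


def borough_sample_size():
    """Number of sampled dwellings retained for whole-Borough estimates."""
    return sum(d["surveyed"] * subsample_rate[n] for n, d in districts.items())


for name in districts:
    print(name, overall_fraction(name))
print("Borough sample:", borough_sample_size())
```

Under these assumptions every district ends up with the same overall fraction, so the combined sample is self-weighting and no district weights are needed for Borough totals.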

The survey provided data of three kinds: a certain amount of 
information for each sampled household (e.g. no. of rooms, no. of 
persons, overcrowding); information for each individual (e.g. age, sex, 
civil status, occupation, education); and detailed fertility data relating 
to each married woman. On the basis of this full and carefully obtained 
information, the authors were able to give invaluable guidance for 
future planning. 


PUBLIC OPINION AND MARKET RESEARCH 

It is not intended in this article to give an exhaustive account of the 
surveys conducted or methods employed in opinion polls and market 
research in this country. Some of the main organisations in the field 
will be mentioned and an indication will be given of any interesting 
characteristics they may possess. The justification for this somewhat 
sketchy treatment is twofold. In the first place, it is in the nature of 
the work of, at any rate, the market research bodies that the aims, 
methods and results of their surveys are made known to their clients 
rather than to the general public. Anything like an accurate account of 
their activities would be very hard to give. In the second place, it is 
very much doubted whether either the methods employed by public 
opinion polls and market research bodies here or the fields of 
application to which they direct their attention differ sufficiently from 
American practice in this field to warrant a full discussion in this article. 

(i) British Institute of Public Opinion 

The B.I.P.O., founded in 1936, is one of the international chain of 
Gallup institutes. Opinion polls in this country do not, of course, 
enjoy anything like the publicity given to polls in America. Thus, the 
findings of the B.I.P.O. are published by only one newspaper, the daily 
News Chronicle. Nor could it be said that opinion polls here yet arouse 
such universal interest or such bitter controversy as in the United 
States. 

The B.I.P.O. has, in the past 12 years, conducted polls on a wide 
variety of topics and has also published fairly accurate forecasts on 
the 1945 general election and on numerous bye-elections. Apart from 
the work concerned purely with opinion polls, B.I.P.O. has, on a 
number of occasions, co-operated in, or conducted surveys for, other 
organisations. Thus, it undertook part of the war-time body-weight 
survey for the Ministry of Food and several surveys for the Board of 
Trade. The method of sampling used by B.I.P.O. is quota sampling. 

(ii) Mass Observation 

The surveys conducted by Mass Observation are not based on scientific 
sampling methods. Mass Observation is mentioned here because it 
has received very wide publicity and aroused considerable interest. 
Founded in 1937 it has, in a very active decade, published some 14 
books and several hundred bulletins and articles. 

Its reports are based on two different sources: field surveys and a 
national panel of respondents, neither of which appears to be based 
on a randomly selected sample. 

(iii) Market Research Bodies 

As in America, there is in England a large and increasing number 
of market research organisations. As far as can be judged, from the 
few leaflets which it has been possible to obtain, all of these 
organisations claim that scientific sampling methods form the basis of their 
results and standard error formulae are usually quoted. Not enough is 
published to permit any evaluation of the methods employed by market 
research bodies; it can be stated, however, that the method of selection 
generally employed is quota sampling which is cheap (judged purely 
in money terms), offers great ease of administration and fieldwork 
(it by-passes the problem of call-backs), and is, for commercial purposes, 
considered as sufficiently accurate. 

The most important organisations in the field are probably the British 
Market Research Bureau and Research Services Ltd. The former, 
among other activities, has done a number of surveys for the Board of 
Trade and the Ministry of Food. Research Services Ltd., is the new 
name of what was formerly the Research Department of the London 
Press Exchange. It now constitutes a separate company and covers a 
wider range of activities than other market research bodies. Another 
company, Attwood (Statistics) Ltd., runs a Consumer Panel, selected 
on the basis of a random sample. 


D 

A COMPARISON BETWEEN BRITISH AND AMERICAN 
SAMPLING PRACTICE 

The preceding sections show that investigations on a wide variety 
of subjects are undertaken by means of sampling, both inside 
Government departments and by other organisations. Nevertheless, it can 
hardly be claimed that the available sampling techniques are exploited 
to the full or that sufficient effort is made to experiment with different 
and more refined methods. The remainder of this article is devoted to 
a discussion—based on the foregoing survey—of the general features 
distinguishing the use of sampling in Great Britain. 

It is important, in this discussion, to guard against the error of 
treating the American and British situations as strictly comparable. The 
enormous area and widely dispersed population of America give rise, 
in the first place, to a much greater necessity for sampling and have led 
to the development of sampling methods particularly suited to these 
circumstances. In Great Britain, the population is relatively small and 
highly concentrated. It would be a mistake to believe that American 
sampling methods could necessarily be used with the same success here, 
and much experiment and research is needed to decide which 
refinements and developments can usefully be “imported” (it is interesting 
to compare a French point of view on this subject [19]). 

Further to this, it should be pointed out that probably, in this country, 
relatively more data about the population comes in as a by-product 
of administration and does not need to be obtained by special surveys. 
(Information about the Labour Force is a case in point.) This 
difference again implies that the necessity for sampling is not usually as urgent 
as in the United States, but it also points to far greater opportunities 
in the sampling of documents and returns to obtain quick and regular 
data to aid administration. It cannot be said that anything like full 
use is being made of these opportunities. 

(a) The Attitude to Sampling in Great Britain 

It must, in the first place, be emphasised that in England sampling 
is not yet as generally understood or accepted as a reliable way of 
collecting information as in the United States. The American public 
was made “sample conscious” by the evident success and through the 
enormous publicity of public opinion polls in general and their past 
election forecasts in particular. The corresponding absence of such 
publicity here partly explains any lack of confidence in sampling felt 
by the general public. 

More important from the point of view of the development of sound 
sampling practice is the distrust and lack of knowledge remaining, in 
Government departments and elsewhere, among persons who are 
potential initiators of sample surveys. No doubt, now that more 
attention is paid to sampling in University statistics courses, this 
distrust and ignorance will gradually be dispelled. Yet, there is an urgent 
need for the spread of knowledge through the full publication of the 
methods and results of surveys and for a regular procedure by which 
statisticians and administrators in Government departments and 
elsewhere may avail themselves of advice on the conduct of sample surveys. 

A department, perhaps within the Central Statistical Office, should 
devote its whole time to the examination of past and current sample 
surveys in all fields. By scrutinising the methods used in the past and 
by giving advice on new projects, such a department could help first to 
establish and ultimately to maintain a high standard of sampling 
practice. Further, it could act as a central Register of work being done by 
all sorts of organisations and thus prevent much duplication of 
research. 

(b) The Publication of Sample Surveys 

It is obvious that the full publication of survey methods and results 
is a considerable asset to further development. In England, 
unfortunately, the position regarding publication is very unsatisfactory. 

Some indication of the circulation of the Social Survey reports was 
given earlier; the Family Food Survey of the Ministry of Food, which 
has continued for seven years and is the main source of information 
concerning the diet of the British public, is only now receiving 
publication; as far as can be ascertained, the sample surveys undertaken by 
the Ministry of Works, the Board of Trade, the Ministry of Labour 
and the Ministry of Home Security have not been described in any 
publication. 

Non-publication of sampling methods is only one aspect of a general, 
but possibly decreasing, tendency in Government departments to 
treat their information as confidential or, at least, not for publication. 
This is a great barrier to the development of sampling methods and the 
spread of knowledge. It is to be hoped that the valuable suggestions 
concerning this problem, made in a recent report by the National 
Institute of Economic and Social Research [20], will be followed up. 
In all fairness, it must be added that there are exceptions to this 
tendency. Thus, the National Farm Survey was fully described in the 
official and publicly available report. Similarly, the methods as well as 
the results of the Sample Family Census will be given in great detail 
in the forthcoming report on that survey. 

There is one more point on this question of publication. With the 
increasing use of sampling in investigations concerned with human 
populations, there appears to be a good case for an international journal 
devoted to this subject. Such a journal would not need to confine itself 
to sampling, but could range over all the methodological problems which 
arise in the planning and execution of surveys. Perhaps the United 
Nations Sub-Commission on Statistical Sampling might consider the 
question. 

(c) Methodological Research 

The work on sample surveys in this country is largely confined to 
ad hoc surveys or, at any rate, surveys which are important for their 
results rather than their methods. There is clearly a need for more 
systematic and coordinated experimentation and research on sampling 
techniques, on the lines of the work done, for instance, by the Bureau 
of the Census. Perhaps this task could most suitably be performed by 
a department such as that suggested in (a) above. 

(d) Area Sampling and Quasi-Random Sampling 

This is not the place to enter into the controversy regarding the 
respective merits of these two methods of sampling. It is desired 
merely to indicate why area sampling, which has been so successfully 
applied in the United States, is not used here; and whether it is likely 
to be used in the future. 

The most obvious reason for the non-use of area sampling here 
lies in the availability of the National lists, mentioned earlier. None 
of these lists is perfectly accurate, but if care is taken, they can form 
an adequate basis for random sampling at given intervals, i.e., quasi- 
random sampling. The fact that these lists are available, while maps 
suitable for area sampling are not (the maps of the Fire Service may 
be closest to what is required) has acted as a check to experimentation 
with area sampling methods. 
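
As an illustration of “random sampling at given intervals”, the following minimal sketch draws every k-th entry from a list after a random start. The function name `quasi_random_sample`, the fixed seed and the made-up address list are assumptions introduced for the example; they do not describe the procedure of any particular British survey.

```python
import random


def quasi_random_sample(frame, interval, rng=None):
    """Select every `interval`-th entry of `frame` after a random start."""
    if interval < 1:
        raise ValueError("interval must be a positive integer")
    rng = rng or random.Random()
    start = rng.randrange(interval)  # random start in 0 .. interval - 1
    return frame[start::interval]


# Hypothetical list of 50 addresses standing in for a national list.
addresses = [f"address-{i:03d}" for i in range(50)]
sample = quasi_random_sample(addresses, interval=10, rng=random.Random(1))
print(sample)
```

Only the starting point is random; once it is fixed, the remaining selections follow mechanically from the order of the list, which is why the accuracy and ordering of the list matter.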

It is probable that even with the relatively small and concentrated 
population of Great Britain, area sampling would lead to certain 
advantages. Research is needed to show whether the high initial outlay 
would be justified by increased accuracy and economy resulting from 
its use. 

(e) The Use of Purposive Selection 

It has been seen that, in Great Britain, most of the sample 
investigations concerned with human populations rely on stratified random 
sampling from the lists as their method of selection. If national estimates 
are involved, the sample may have to be spread geographically over the 
whole country. Often, cost considerations and interviewer resources 
make it desirable to get the field work reasonably concentrated and, 
in a number of surveys, the primary sampling units are chosen by 
purposive selection. Thus, in the Ministry of Food Family Diet Survey, 
the towns are chosen to be representative of Great Britain, with regard 
to their size and the character of the region. The same sort of selection 
is used in some of the Social Survey investigations. 

Selection of the primary sampling units on the basis that they are 
“representative” may easily introduce bias and it will be better to 
use the alternative procedure of sampling at random from a list of areas 
or towns. The areas would be grouped in strata in some appropriate 
way and a variable sampling fraction could be applied, perhaps on the 
lines used in the United States Sample of the Labour Force [21]. 

(f) The Use of Quota Sampling 

Within the primary sampling units, the usual procedure employed 
for the selection of the final sampling units is quasi-random sampling. 
It should, however, be noted that quota sampling is still the method 
generally used by commercial organisations, opinion research bodies 
(including Listener Research) and occasionally in Social Survey 
investigations. 

The cheapness and ease of execution of quota sampling are qualities 
which endear it to commercial organisations. But, from a statistical 
point of view, the method is unsatisfactory for three reasons. In the 
first place, the choice of units depends on human judgement and, 
therefore, may involve serious bias. In the second place, quota 
sampling does not permit any effective control over the work of the 
interviewers; and finally, it is not possible to attach an estimate of the 
sampling error to the results. The method should, therefore, not be 
employed in any surveys on which administrative decisions are to be 
based. 





(g) Further Uses of Sampling 

There are a number of fields to which sampling might well begin 
to be applied in this country and there are some indications already 
that new ground is being broken and that further uses are being 
contemplated. 

The Board of Trade has, at its disposal, a fairly complete list of 
industrial firms in the country. This might serve as the basis for samples 
designed to obtain information additional to that supplied by the 
Censuses of Production and Distribution. It is believed that the lists of 
firms in specified manufacturing industries are to be utilized for sample 
surveys on Capital Expenditure. 

Much information on family expenditure is needed and will regularly 
have to be obtained for any contemplated permanent Retail Price 
Index to take the place of the present interim Index. Sample surveys 
have been started by the Social Survey with a view to obtaining 
information on the national expenditure on selected items. So far, surveys 
on laundry and household textiles have been conducted and ultimately 
all types of consumer expenditure will be covered. 

The most obvious gap in sampling practice in this country is in the 
field of population. It is to be hoped that the Registrar-General will 
decide to use sampling in the 1951 Census and that he will also begin 
to consider the possibility of taking regular sample surveys of the 
population to fill in the 10-year gaps between Censuses. Sampling for 
population estimates requires more refined techniques than are 
customary in most of the surveys mentioned above. It is in this field that 
American developments have been most striking and any 
contemplated use of sampling at the next Census of England and Wales should 
be based on a very careful examination of American experiences. 

The use of sampling can lead to great economy in cost, time and 
personnel and is bound to play an increasingly important role in 
Government administration and elsewhere. The most urgent needs in this 
country are for systematic research, in government departments, 
universities and other bodies, into sampling as well as other (and often 
more difficult) problems connected with surveys; and for full 
publication of the methods used in and lessons learnt from completed sample 
surveys. 

REFERENCES 

[1] Report of the Sub-Commission on Statistical Sampling (United Nations 
E/CN.3/37) issued 21st October, 1947. 

W. E. Deming “A Brief Statement on the Uses of Sampling in Censuses of 
Population, Agriculture, Public Health and Commerce” published by the 
United Nations, Lake Success, 1948. 




Series of reports submitted to the 2nd meeting of the Sub-Commission at 
Geneva in September 1948 (United Nations E/CN.3/Sub.1/various). 

[2] Bowley, A. L., “Working-class Households in Reading,” Journal of the 
Royal Statistical Society, 1913. 

[3] Hilton, John, “Enquiry by Sample: an Experiment and its Results,” Journal 
of the Royal Statistical Society, 1924. 

[4] Stephan, F. F., “History of the Uses of Modern Sampling Procedures,” 
Presented at the 25th Session of the International Statistical Institute and 
reprinted in the Journal of the American Statistical Association, Vol. 43, 
March, 1948. 

[5] Ministry of Labour, “Weekly Expenditure of Working-class Households in 
the U.K., 1937-8,” Ministry of Labour Gazette, Dec. 1940, Jan. and Feb. 
1941. 

[6] Interim Report of the Cost of Living Advisory Committee. H.M.S.O. Cmd. 
7077; and “Index of Retail Prices” Industrial Relations Handbook 1944, 
Supplement No. 2, January, 1948. 

[7] Box, K. and Thomas G., “The Wartime Social Survey,” Journal of the Royal 
Statistical Society, 1944. 

[8] Registrar-General's Decennial Supplement for England and Wales, 1931, Part 
IV (published 1947). 

[9] Article on the Sample Family Census in “The Times,” January 17th, 1946. 

[10] Ministry of Agriculture, “National Farm Survey of England and Wales,” 
A Summary Report. H.M.S.O., 1946. 

[11] Silvey, R. J. E., “Methods of Listener Research employed by the B.B.C.,” 
Journal of the Royal Statistical Society, 1944. 

[12] Population Investigation Committee, “Maternity in Great Britain,” Oxford 
University Press, 1948. 

[13] Bourneville Village Trust, “When We Build Again,” 1941. 

[14] Young, T., “Becontree and Dagenham,” 1934. 

[15] Durant, Ruth, “Watling,” 1939. 

[16] Lock, Max and others, “The County Borough of Middlesbrough Survey and 
Plan,” 1947. 

[17] Grundy, F. and Titmuss, R. M., “Report on Luton,” 1945. 

[18] Glaisyer, Janet and others, “County Town,” 1946. 

[19] Thionet, P., “Méthodes Statistiques Modernes des Administrations Fédérales 
aux États-Unis,” Paris, 1946. 

[20] National Institute of Economic and Social Research, “Social and Economic 
Research and Government Departments,” 1947. 

[21] Hansen, M. H. and Hurwitz, W. N., “A New Sample of the Population,” 
Bureau of the Census, 1944. 



UNEMPLOYMENT AND MIGRATION IN THE 
DEPRESSION (1930-1935)* 


Ronald Freedman 
AND 

Amos H. Hawley 
University of Michigan 

This is a study of the reciprocal relationship between 
migration and unemployment in Michigan during the depression 
period 1930 to 1935. Specifically, it is concerned with two 
questions: 

(1) Do the migrants during a depression have a poor 
employment history as compared with non-migrants of 
similar characteristics? 

(2) Do the migrants have a better employment experience 
after migration than non-migrants of similar 
characteristics? 

By the use of matched control groups it is found that the 
differential in unemployment rates occurs after migration, not 
before. The results are consistent with the hypothesis that in 
a depression migrants tend to be at a disadvantage in the new 
labor market to which they move. 

THE PROBLEM 

The questions above are at the heart of the more general problem 
of the relationship between migration and economic opportunity. 
Comparisons of the pre-migration employment status of the migrants 
with similar persons who stay at home should indicate whether 
unemployment is the important “push” factor in migration it is frequently 
believed to be. Comparisons of the post-migration employment history 
of the migrants with those of non-migrants at the source-point should 
help to indicate whether migrants succeeded in improving their 
employment status by making their move. Similar comparisons with the 
non-migrants at the destination point can help to indicate whether the 
migrants are at a disadvantage in competing for economic opportunities 
with the resident population. 

Most studies of these questions have failed to give convincing answers 
even for a specific time and place. Some studies have failed, because they 
have focused on "distress” migrants without considering the relative 
importance of such migrants in the total stream of migration. The 
result has been a stereotype of the migrant in a role such as that of 

* This study was made possible by a grant from the Faculty Research Funds of the Horace Rackham 
School of Graduate Studies of the University of Michigan. 




the “Okie”. Other studies have failed to compare the migrants with 
similar non-migrants at either end of the migration process, so that it 
is difficult to determine whether the employment history of the migrants 
was distinctive and whether it was a product of migration or of some 
other factor. 

There is an attempt in the present study to minimize the first 
difficulty by considering a sample of migrants in the area studied without 
restriction as to their “distress” position. The second difficulty is 
minimized by comparing the migrants with carefully selected “control” 
groups of non-migrants at each end of the migration process. An 
important methodological objective of this study is to experiment with 
the use of the matched-group control method in the 
population-migration field. The authors recognize that the data of this study are 
limited to a specific time period and to specific streams of migration. 
Many further studies are needed to permit valid generalizations. It 
is desirable that such studies should refer to migration between specific 
areas and should control as many relevant factors as possible. 

THE DATA AND METHODS 

The data for this study are from the Michigan Population and 
Unemployment Census of 1935.¹ The distinctive feature of the schedule for 
this census is a month-by-month history of the occupation and 
employment status of every person 16 years old or over for the period April, 
1930, to January, 1935. This history includes data on place of 
employment, which makes it possible to identify migrants and their routes of 
movement. 

The sample for the present study consists of all those white male 
migrants to Flint or Grand Rapids from other places in Michigan, who 
were at least 25 years old at the time of migration to the cities. Males 
15 to 24 years old were excluded from the study in order to deal only 
with those migrants whose education was presumably completed. The 
female migrants were reserved for a later study, because their 
employment status involves special factors. Non-white migrants, originally 
included in the study, were eliminated, because they were found to 
constitute a negligible proportion of the intra-Michigan migrants to 
Flint and Grand Rapids. Flint and Grand Rapids were selected as the 
destination points to be studied, because they are very different both 

¹ Michigan Census of Population and Unemployment, Michigan State Emergency Welfare Relief 
Commission: Lansing, 1937, Nos. 1-10. The sampling procedure for this study involved complete 
enumeration of all cities in Michigan between 3,000 and 40,000 in population, a 20 per cent sample in 
each city of 40,000 and over, and a complete enumeration of a random selection of 20 per cent of all 
the towns and villages (under 3,000) and of open country type rural townships. 




with respect to population history and economic base. Therefore, it 
was hoped that studies of the migrants to each place might be treated 
as separate “experiments.” Agreement of the findings for the two cities 
should give them greater validity. The study was necessarily limited 
to intra-Michigan migration, because data were not available for non- 
migrants outside of Michigan. After observing the limitations noted 
above in selection of the sample, the sample of migrants to Flint was 
found to number 360. The number in the Grand Rapids sample was 186. 

These samples are the basis for a series of studies of the relationship 
between migration status and a number of other characteristics 
(education, occupation, occupational mobility). This paper on the 
relationship of migration and unemployment is only one part of the larger series. 

For the study of unemployment, each migrant was “matched” with 
a “control” non-migrant at the place from which he came and another 
at the place to which he moved (either Grand Rapids or Flint). The 
characteristics used for matching were age (within 3 years), occupation 
(in terms of the major census socio-economic groups), occupational 
history (in terms of change between socio-economic classes), education 
(within 2 years of school achievement), and marital status. For every 
characteristic, except marital status, the matching was done as of the 
date of migration. Data on marital status were available only as of 
the end of the period. Nevertheless, marital status was used as a basis 
of matching, because it was felt that the error involved would be 
considerably less than that involved in omitting this factor as a control. In a very few 
cases migrants in the study had come from places of less than 3,000 
population which had not been included in the census sample. Since it 
was impossible to match such migrants at the actual source-point, they 
were matched with non-migrants from a similar place located near the 
actual source-point. Three of the cases of failure to match migrants are 
accounted for by the failure to find a substitute source-point. The total 
number of such successfully made substitutions is 9. 

Although the choice among “eligible” matches was not made in the 
most rigorous manner possible, it is believed that the procedure used 
did not introduce any systematic biases. The procedure was essentially 
to enter the schedules at a point determined by a system of random 
numbers, examine the schedules serially until a match was found, then 
to select a new starting point and to repeat the process. Several 
alternative methods involving selection of all possible eligible matches and 
random selection among them were abandoned as prohibitively 
time-consuming. There does not seem to be any reason to believe that 
systematic errors were introduced by the method used. 
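
The matching rule and the random-entry search can be sketched in code. This is an illustrative reconstruction only: the `Schedule` fields, the wrap-around at the end of the file, and the rule that a control is used at most once are assumptions of the sketch, not details stated by the authors.

```python
import random
from dataclasses import dataclass


@dataclass
class Schedule:
    age: int
    occupation_group: str      # major census socio-economic group
    occupational_history: str  # change between socio-economic classes
    school_years: int
    marital_status: str


def is_match(migrant, candidate):
    """Apply the matching tolerances described in the text."""
    return (
        abs(migrant.age - candidate.age) <= 3
        and migrant.occupation_group == candidate.occupation_group
        and migrant.occupational_history == candidate.occupational_history
        and abs(migrant.school_years - candidate.school_years) <= 2
        and migrant.marital_status == candidate.marital_status
    )


def find_control(migrant, schedules, used, rng):
    """Enter the file at a random point and scan serially for a match.

    Returns the index of the first eligible schedule, or None.
    Wrap-around and single use of controls are assumptions of this sketch.
    """
    n = len(schedules)
    if n == 0:
        return None
    start = rng.randrange(n)
    for offset in range(n):
        i = (start + offset) % n
        if i not in used and is_match(migrant, schedules[i]):
            used.add(i)
            return i
    return None


# Hypothetical schedules, for illustration only.
pool = [
    Schedule(50, "skilled", "no change", 8, "married"),
    Schedule(32, "skilled", "no change", 9, "married"),
    Schedule(31, "clerical", "no change", 8, "married"),
]
migrant = Schedule(30, "skilled", "no change", 8, "married")
print(find_control(migrant, pool, set(), random.Random(0)))
```

Taking the first eligible schedule after a random entry point is cheaper than listing all eligible matches and drawing one at random, which is the alternative the authors report abandoning as too time-consuming.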





Although it was not possible to match all the migrants at both ends 
of the migration, proportions matched are large enough so that the 
authors are confident that the results of the study are not affected 
by the unmatched cases. Of the 360 migrants to Flint, 312, or 87 per 
cent, were matched at the destination and 296, or 82 per cent, at the 
source-points. Of the 186 migrants to Grand Rapids, 171, or 92 per cent, 
were matched at the destination and 149, or 80 per cent, at the 
source-points. This represents a considerably higher proportion of matched 
cases than are found in most comparable “matched-group” studies. 
A study of the unmatched migrants indicates that their employment 
history is not sharply different from that of the matched cases. 
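
The quoted percentages follow directly from the counts, as this small check shows. The dictionary layout and the helper name `percent_matched` are illustrative choices.

```python
# (matched, total) counts quoted in the paragraph above.
counts = {
    ("Flint", "destination"): (312, 360),
    ("Flint", "source"): (296, 360),
    ("Grand Rapids", "destination"): (171, 186),
    ("Grand Rapids", "source"): (149, 186),
}


def percent_matched(matched, total):
    """Share of migrants matched, rounded to the nearest whole per cent."""
    return round(100 * matched / total)


percentages = {key: percent_matched(*pair) for key, pair in counts.items()}
for (city, end), pct in percentages.items():
    print(f"{city:12s} {end:11s} {pct} per cent")
```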

Several specific problems in connection with the matching need to 
be explained. First, those migrants whose occupational classification 
was “farmer” or “farm-laborer” obviously could not be matched 
identically for occupation at the destination point. The procedure followed was 
to match farmers with members of the “proprietors, managers, and 
officials” group and to match “farm-laborers” with “common laborers.” 
Although these are the groups closest to the farm occupational 
categories in the census hierarchy of socio-economic classes, the matching 
on this basis is admittedly makeshift. Therefore, in the specific 
comparisons of migrants and non-migrants at the destination the effect 
of including or excluding farm migrants has been evaluated in each 
case. In any event, the matching permits a comparison of the farm 
migrants with the urban occupational group with which they are most 
frequently contrasted and even combined.² 

In the study of unemployment a second problem arose concerning 
those persons who were either unemployed or out of the labor force 
during the entire period from 1930 to date of migration. It was not 
possible to classify these persons by occupation during the pre-migration 
period. In each of these cases the match was made as if 
“unemployment” or “out of the labor force” were the pre-migration occupation, 
respectively. This means that the employment status of these two 
categories of migrants was artificially perfectly matched for the 
pre-migration period. These cases can be compared, however, for the 
post-migration period for which there were no artificial restraints. During 
the pre-migration period the comparisons are valid only for those who 
were employed at some time. In those cases where either a migrant or 
a control non-migrant was found to be out of the labor force during 
part of a period, this part of the period was omitted from the comparison 

² In a number of tabulations of the 1940 U. S. Census for cities farm operators were tabulated with 
“proprietors, managers, and officials,” and farm laborers were tabulated with “laborers.” 






AMERICAN STATISTICAL ASSOCIATION JOURNAL, JUNE 1949 


for the pair. The periods excluded for all groups taken together were 
relatively small. 

We matched 271 of the Flint migrants and 140 of the Grand Rapids 
migrants at both ends of the migration process. Comparison of tables 
based on these completely matched (both-end) cases with the tables based 
on migrants matched at only one end of the comparisons has resulted 
in no essential differences. Therefore, in this paper the tables and discussions 
are based on the larger number of cases, including one-end matches. 

On the average the migrants made their moves about 37 months after 
the beginning of the period, so the average pre-migration period is 
about 3 years. One important limitation of the data for this study needs 
to be mentioned. Data are available only for “resultant” migrants, that 
is those who moved to Flint or Grand Rapids between 1930 and 1935 
and were still there in 1935. Data for the migrants who moved again 
away from Flint or Grand Rapids before 1935 are not available. 

THE FINDINGS 

In the pre-migration period, the migrants to Flint or Grand Rapids 
had a somewhat higher rate of unemployment* than the non-migrant 
control group at the source-points. However, in neither case was the 
unemployment rate of the migrants or the differential between migrants 
and non-migrants large enough to justify considering the personal 
experience of unemployment as the important causal factor in mi¬ 
gration. 

The data on pre-migration employment status are found in Table 1 
for the migrants and their source matches. In comparing migrants 
and non-migrants during the pre-migration period those unemployed 
during the entire pre-migration period should be excluded, since these 
migrants and non-migrants are artificially equated on unemployment. 
Excluding this group, 18 per cent of the Flint migrants and 13 per cent 
of the non-migrant source matches were unemployed in the pre-migration 
period. For Grand Rapids the corresponding figures are 13 
per cent and 11 per cent. Even if those totally unemployed during the 
pre-migration period are included in these rates only 24 per cent of 
the Flint migrants and 21 per cent of the migrants to Grand Rapids were 
unemployed in the pre-migration period. The migrants do not appear 
to be a group marked by an unusually high rate of unemployment for 
the period in question. In this and other comparisons to follow, differences 
will appear as between unemployment rates for Flint and Grand 
Rapids for each type of migrant or non-migrant. Such differences are 
to be expected on the basis of the different economic structures of the 
two cities. However, for the two cities the differences between the migrant 
and non-migrant unemployment rates are in the same direction 
in almost every comparison. Variations in the absolute rates of unemployment 
as between the two cities are not relevant to the problems 
being investigated. 

* The "unemployment rate" used in this study refers to the percentage of persons in any group 
who were unemployed at any time during the period in question. 

UNEMPLOYMENT AND MIGRATION 

TABLE 1 

PERCENTAGE DISTRIBUTION OF EMPLOYMENT STATUS OF MIGRANTS AND 
NON-MIGRANTS MATCHED AT SOURCE, FOR PRE-MIGRATION PERIOD, 
FLINT AND GRAND RAPIDS 

                                 Migrant Status and 1935 Residence 
Employment Status in               Flint                Grand Rapids 
Pre-migration Period       Migrants  Non-migrants   Migrants  Non-migrants 

Total percentage             100.0      100.0         100.0      100.0 
In labor force:               99.0       99.0          99.3       99.3 
  Always employed             76.0       81.1          85.2       87.2 
  Always unemployed            5.4        5.4           0.7        0.7 
  Sometimes unemployed        17.6       12.5          13.4       11.4 
Out of labor force             1.0        1.0           0.7        0.7 

Number                         296        296           149        149 

A check on the relative numerical importance of the group of migrants 
unemployed throughout the pre-migration period was provided 
by the selection of "random" samples of non-migrants at source and 
destination points for each group of migrants. The only restriction 
on selection was that the non-migrants should be at least 25 years of age, male, and 
white. Each migrant group thus had a random non-migrant "comparison" 
group. These non-migrants were paired at random with migrants 
to determine a date for separation of the "pre-migration" and "post-migration" 
periods. At both ends of the migration process and in both 
cities the number of migrants unemployed throughout the pre-migration 
period was approximately equal to the number so unemployed among 
the random comparison groups. Whether considered on an absolute 
basis or relative to non-migrant comparison groups, those unemployed 
throughout the pre-migration period were not an important group. 

It is interesting to note that in the pre-migration period the employment 
status of the migrants did not compare unfavorably with that of 















non-migrant controls at the destination points. As Table 2 indicates, 
there was less than 1 percentage point difference in the unemployment 
rates of migrants to Flint and the non-migrant controls in Flint; 19.9 
per cent of the migrants and 20.5 per cent of the non-migrants were 
unemployed in this comparison. The unemployment rate among 
migrants to Grand Rapids was even less than that of non-migrants 
in Grand Rapids in the pre-migration period. The unemployment 
rates were 15.2 and 21.1 per cent for migrants and non-migrants respectively. 

TABLE 2 

PERCENTAGE DISTRIBUTION OF EMPLOYMENT STATUS OF MIGRANTS AND 
NON-MIGRANTS MATCHED AT DESTINATION, FOR PRE-MIGRATION PERIOD, 
FLINT AND GRAND RAPIDS 

                                 Migrant Status and 1935 Residence 
Employment Status in               Flint                Grand Rapids 
Pre-migration Period       Migrants  Non-migrants   Migrants  Non-migrants 

Total percentage             100.0      100.0         100.0      100.0 
In labor force:               98.4       98.4          98.9       98.9 
  Always employed             71.8       71.2          81.9       76.0 
  Always unemployed            6.7        6.7           1.8        1.8 
  Sometimes unemployed        19.9       20.5          15.2       21.1 
Out of labor force             1.6        1.6           1.1        1.1 

Number                         312        312           171        171 


The results which have been cited for the pre-migration period were 
found to be substantially unaltered when separate tabulations were 
made excluding the migrants with a farm background. 

The peak unemployment rate for migrants occurred after rather than 
before the migration. While the frequency of unemployment among 
migrants rose sharply in the post-migration period, the unemploy¬ 
ment rate for the non-migrant controls showed only a moderate in¬ 
crease at either end of the movement. This is demonstrated by the data 
in Tables 1-4 to be true whether or not the totally unemployed are 
included in the pre-migration unemployed group. For example, while 
the unemployment rate for migrants to Flint increased from 24 to 
49 per cent, that for the non-migrant source controls increased from 
18 to 24 per cent. While the percentage of increase in unemployment 
rate was no more than 36 per cent for any non-migrant group, it was 
no less than 82 per cent for any migrant group. In every case the 
increase in unemployment rates for the migrant group was significantly 

















greater⁴ than the corresponding increase for its non-migrant control, 
whether source or destination. 
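The percentage increases quoted above can be checked from the figures in Tables 1-4. The short Python sketch below is an illustration only, not part of the original analysis. It assumes that the pre-migration unemployment rate of a group is the sum of its "always unemployed" and "sometimes unemployed" percentages and that the post-migration rate is the "unemployed" percentage; the names `RATES` and `percentage_increase` are introduced here for the example. On that reading the Flint source-matched migrants start from 23.0 per cent, which the text rounds to 24.

```python
# Unemployment rates (per cent) read from Tables 1-4 of the text.
# pre  = "always unemployed" + "sometimes unemployed" (pre-migration period)
# post = "unemployed" (post-migration period)
# This reading of the tables is an assumption made for illustration.
RATES = {
    # (city, matched at, group): (pre, post)
    ("Flint", "source", "migrant"): (5.4 + 17.6, 49.0),
    ("Flint", "source", "non-migrant"): (5.4 + 12.5, 24.3),
    ("Grand Rapids", "source", "migrant"): (0.7 + 13.4, 37.6),
    ("Grand Rapids", "source", "non-migrant"): (0.7 + 11.4, 14.8),
    ("Flint", "destination", "migrant"): (6.7 + 19.9, 48.4),
    ("Flint", "destination", "non-migrant"): (6.7 + 20.5, 26.6),
    ("Grand Rapids", "destination", "migrant"): (1.8 + 15.2, 36.2),
    ("Grand Rapids", "destination", "non-migrant"): (1.8 + 21.1, 29.8),
}


def percentage_increase(pre, post):
    """Relative change from pre to post, expressed in per cent."""
    return 100.0 * (post - pre) / pre


INCREASES = {
    key: percentage_increase(pre, post) for key, (pre, post) in RATES.items()
}

# Extremes used in the text's comparison of migrants and non-migrants.
smallest_migrant_increase = min(
    value for key, value in INCREASES.items() if key[2] == "migrant"
)
largest_non_migrant_increase = max(
    value for key, value in INCREASES.items() if key[2] == "non-migrant"
)

if __name__ == "__main__":
    for (city, matched_at, group), value in INCREASES.items():
        print(f"{city:13s} {matched_at:12s} {group:12s} {value:7.1f}")
```

Under these assumptions the smallest migrant increase and the largest non-migrant increase agree, after rounding, with the 82 and 36 per cent cited in the text.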

TABLE 3 

PERCENTAGE DISTRIBUTION OF EMPLOYMENT STATUS OF MIGRANTS AND 
NON-MIGRANTS MATCHED AT DESTINATION, FOR POST-MIGRATION PERIOD, 
FLINT AND GRAND RAPIDS 

                                 Migrant Status and 1935 Residence 
Employment Status in               Flint                Grand Rapids 
Post-migration Period      Migrant   Non-migrant    Migrant   Non-migrant 

Total percentage             100.0      100.0         100.0      100.0 
In labor force:               96.8       97.8          94.1       98.2 
  Employed                    48.4       71.2          57.9       68.4 
  Unemployed                  48.4       26.6          36.2       29.8 
Out of labor force             3.2        2.2           5.9        1.8 

Number                         312        312           171        171 


TABLE 4 

PERCENTAGE DISTRIBUTION OF EMPLOYMENT STATUS OF MIGRANTS AND 
NON-MIGRANTS MATCHED AT SOURCE, FOR POST-MIGRATION PERIOD, 
FLINT AND GRAND RAPIDS 

                                 Migrant Status and 1935 Residence 
Employment Status in               Flint                Grand Rapids 
Post-migration Period      Migrant   Non-migrant    Migrant   Non-migrant 

Total percentage             100.0      100.0         100.0      100.0 
In labor force:               97.6       98.6          96.0       97.4 
  Employed                    48.6       74.3          58.4       82.6 
  Unemployed                  49.0       24.3          37.6       14.8 
Out of labor force             2.4        1.4           4.0        2.6 

Number                         296        296           149        149 


The relatively high post-migration unemployment rate was not 
restricted to a particular occupational group, although the migrants 
with a farm background had the highest unemployment rates. Tables 
5 and 6 show the unemployment status of each migrant and non-migrant 
control group separately for 3 major occupational groupings: 


⁴ The differences were significant at the .01 level in every case but one. The difference for the 
Grand Rapids migrant and non-migrant destination groups was significant at the .05 level. The differences 
were significant at these levels whether or not the "totally" unemployed were considered in the 
pre-migration period. 

















[Tables 5 to 8 are not legible in this copy. The one surviving caption reads:] 

PERCENTAGE DISTRIBUTION OF POST-MIGRATION EMPLOYMENT STATUS, BY EDUCATION, MIGRANTS AND 
NON-MIGRANTS MATCHED AT SOURCE, FLINT AND GRAND RAPIDS, MICHIGAN 




¹ Exclusive of individuals out of labor force during entire post-migration period. 













white collar workers (professional workers, proprietors and managers, 
officials, and clerical workers), blue collar workers (skilled, semi-skilled, 
and unskilled workers, and servants), farm workers (farmers 
and farm laborers). In each of these occupational categories the post-migration 
unemployment rate of the migrants to Flint or Grand Rapids 
was higher than that for comparable non-migrant control groups at 
the source or destination. It is clear from these data that the farm 
migrants had a much higher rate of unemployment than any other 
group whether migrant or non-migrant. 

Similarly, the relatively high unemployment rate of the migrants is 
found at each of three educational levels. In Tables 7 and 8 the employment 
status of each migrant and non-migrant group is shown separately 
for those with grade school, high school, and college education, respectively. 
The only comparison in which the migrants do not have the 
higher rate of unemployment is that between migrants to Grand Rapids 
and non-migrants to Grand Rapids for those with some college education. 
The numbers involved in this comparison are very small. 

The sizes of the sub-samples of the occupational groups and edu¬ 
cational groups on which the percentages for Tables 5-8 are based are 
found in Table 9. 


TABLE 9 

NUMBER IN EACH SAMPLE, BY OCCUPATIONAL CLASS AND EDUCATIONAL 
LEVEL, FLINT AND GRAND RAPIDS 

                                    Flint                    Grand Rapids 
Occupational Class and      Sample      Sample           Sample      Sample 
Educational Level           Matched     Matched at       Matched     Matched at 
                            at Source   Destination      at Source   Destination 

Occupational class: 
  Total                        296         312              149         171 
  White collar                  88          94               60          69 
  Blue collar                  106         113               48          56 
  Farm                          83          79               39          41 
  Others                        19          26                2           5 

Educational level: 
  Total                        296         312              149         171 
  Grade school                 166         168               86          98 
  High school                  106         117               43          49 
  College                       24          27               20          24 


It is recognized that for these various post-migration comparisons 
it would have been desirable that the matching include control on pre¬ 
migration employment status. However, it can be shown that this 










matching defect does not seriously affect the findings. In the first place, 
the increase in unemployment rates (not just the absolute 
post-migration difference in unemployment) is significantly greater 
for migrants than for non-migrants in every comparison. 
Secondly, separate post-migration unemployment rates were 
computed for those migrants unemployed and those never unemployed 
in the pre-migration period. In every case both the previously employed 
and previously unemployed migrant groups had higher post-migration 
unemployment rates than the non-migrant controls. The difference in 
unemployment rates as between migrants with a pre-migration history 
of unemployment and those with none was relatively small. 

In this connection it is interesting that the correlation between pre- 
migration and post-migration unemployment was considerably greater 
for the non-migrants than for their migrant counterparts. Among the 
migrants the rate of post-migration unemployment was high both for 
those who had and those who did not have pre-migration unemployment. 
On the other hand among the non-migrants the rate of post-migration 
unemployment was high for those with pre-migration unemployment 
but very low for those without it. Apparently, for migrants 
the fact of migration was more important than previous employment 
status in determining employment status in their new home. 

Consideration of the length of unemployment for the unemployed 
in each group of migrants and non-migrants indicates that the higher 
unemployment rate of the migrants is not a consequence of short 
unemployment periods distributed among many persons. Among those 
unemployed after migration, the average length of unemployment was 
greater for migrants than for non-migrants in every comparison but 
one, and in that case the average length of unemployment was the 
same for each group. This is in part an answer to the idea that the 
migrant unemployment was of a short-run "frictional" character. 

Whatever the effects on the sending and receiving communities, 
migration appears to have offered no effective solution to unemploy¬ 
ment for the migrants, themselves. Quite the contrary, the evidence 
seems to support the thesis that migration resulted in a deterioration 
of their employment status. At least for the short-run periods con¬ 
sidered in this study, if there is any line of causation between migration 
and unemployment it appears to run from migration to unemploy¬ 
ment rather than the reverse. 

Evidence for the thesis that unemployment “causes” migration is 
sometimes presented in the form of data showing that over a given 
period of time the unemployment rate was highest for those who 




migrated during the period. Thus, it may be said that 60 per cent of 
the source-matched migrants were unemployed sometime between 
1930 and 1935, as compared with only 27 per cent of matched-source 
non-migrants. Such evidence is really not relevant to the hypothesis, 
since it fails to indicate whether the unemployment preceded or followed 
the migration. Although it is true that 60 per cent of the Flint 
migrants* were unemployed at some time during the 5 year period, 
only 23 per cent of the migrants as compared with 18 per cent of the 
source non-migrants were unemployed prior to migration. The large 
differential in unemployment rates is after migration, not before. 

The results of the present study are consistent with the hypothesis 
that in a depression migrants tend to be at a disadvantage in the new 
labor market to which they move. This may result from the fact that 
the migrants lack the specific skills demanded in the local labor market. 
It may be that, although they have the needed skills, the migrants are 
at a disadvantage in their poorer knowledge of how to get and hold a 
job in the local situation, or it may be that in a depression employers 
are under pressure to favor the local labor supply. 

The methodological aspects of this study are equally important 
with the substantive findings. At least in this instance, the use of 
matched group control techniques proved to be feasible in a migration 
study involving a relatively small group. The technique furnishes 
factor-control frequently lacking in this kind of study. Further ex¬ 
perimentation with the technique appears to be warranted. 

* Sixty per cent of the source-matched migrants and 61 per cent of the destination-matched migrants. 



MINIMUM χ² AND MAXIMUM LIKELIHOOD SOLUTION 
IN TERMS OF A LINEAR TRANSFORM, WITH 
PARTICULAR REFERENCE TO BIO-ASSAY 


Joseph Berkson, M.D. 

Division of Biometry and Medical Statistics, Mayo Clinic, 
Rochester, Minnesota 


An old and commonly used device for fitting a curve 
y = F(x, α, β) is to find a function of y which is linearly related 
to x, or a function of x which is linearly related to y, or functions 
of y and x which are linearly related to each other. The 
linearly related functions are plotted against one another, and 
a line is fitted to the points, usually by eye, sometimes by 
"least squares" in terms of the transforms. Texts dealing with 
empirical curve-fitting characteristically use this scheme [6]. 
Before Bliss and Fisher [2], no attempt was made to adjust the 
transforms systematically in order to achieve a fit that fulfilled 
defined criteria in terms of the original measures y and x, 
nor was it known that such adjustments were possible. These 
authors presented for the bio-assay experiment a method in 
terms of probits, which, as shown by Garwood [5], accomplished 
a maximum likelihood estimate of the integrated normal 
curve, when the observations were distributed binomially. 
Following the procedures used by Bliss and Fisher for the 
integrated normal curve, I [1] formulated the similar adjustments 
applicable to "logits" for a maximum likelihood solution 
of the logistic function, but did not advocate using this solution. 
Finney [4] has recently presented the adjustments for a 
maximum likelihood solution of several functions of interest in 
bio-assay. So far as I know the similar method for a solution 
fulfilling the criteria of minimum χ² rather than maximum 
likelihood has not been presented. In the following paper this 
is given for several functions in common statistical use, and at 
the same time a résumé is given of the maximum likelihood 
adjustments for the same functions. 


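WE ARE DEALING with a measure $Q$ ($P = 1 - Q$) which is a function 
of $x$ and two parameters $\alpha$, $\beta$, the estimates of these parameters 
$a$, $b$, and $Y$, a linear transform of $Q$. 

\[ Q_i = F(x_i, \alpha, \beta) = F(Y_i) \tag{1} \]

\[ Y_i = \alpha + \beta x_i \tag{2} \]

\[ \hat{q}_i = F(x_i, a, b) = F(\hat{y}_i) \tag{3} \]

\[ \hat{y}_i = a + b x_i. \tag{4} \]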




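Let 

\[ z = \frac{\partial \hat{q}}{\partial \hat{y}}, \quad \text{then} \quad \frac{\partial \hat{q}}{\partial a} = z, \quad \frac{\partial \hat{q}}{\partial b} = zx. \]

If the observation $q_i$ is distributed normally and independently with $\sigma$ 
equal at all values of $x$, then the estimates $a$ and $b$ fulfilling the criteria 
of either maximum likelihood or minimum $\chi^2$ are identical, and are 
given by the normal equations 

\[ \sum n_i z_i (q_i - \hat{q}_i) = \sum w_i' (q_i - \hat{q}_i) = 0 \tag{5*} \]

\[ \sum n_i z_i x_i (q_i - \hat{q}_i) = \sum w_i' x_i (q_i - \hat{q}_i) = 0. \tag{6} \]

If $q_i$ is distributed binomially with $\sigma_i^2 = (P_i Q_i)/n_i$, the normal equations 
for maximum likelihood are given by 

\[ \sum \frac{n_i}{\hat{p}_i \hat{q}_i} (q_i - \hat{q}_i) \frac{\partial \hat{q}_i}{\partial a} = \sum \frac{n_i z_i}{\hat{p}_i \hat{q}_i} (q_i - \hat{q}_i) = \sum w_i'' (q_i - \hat{q}_i) = 0 \tag{7} \]

\[ \sum \frac{n_i}{\hat{p}_i \hat{q}_i} (q_i - \hat{q}_i) \frac{\partial \hat{q}_i}{\partial b} = \sum \frac{n_i z_i x_i}{\hat{p}_i \hat{q}_i} (q_i - \hat{q}_i) = \sum w_i'' x_i (q_i - \hat{q}_i) = 0. \tag{8} \]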


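For $\chi^2$ we have 

\[ \chi^2 = \sum \frac{n_i (q_i - \hat{q}_i)^2}{\hat{p}_i \hat{q}_i} \]

and to minimize we set the first derivatives with respect to $a$ and $b$ 
equal to zero, yielding 

\[ -\frac{1}{2} \frac{\partial \chi^2}{\partial a} = \sum \frac{n_i}{\hat{p}_i \hat{q}_i} (q_i - \hat{q}_i) \frac{\partial \hat{q}_i}{\partial a} + \sum \frac{n_i (q_i - \hat{q}_i)^2 (\tfrac{1}{2} - \hat{q}_i)}{(\hat{p}_i \hat{q}_i)^2} \frac{\partial \hat{q}_i}{\partial a} = 0 \tag{9} \]

\[ \sum \frac{n_i z_i \left[ \hat{p}_i \hat{q}_i + (q_i - \hat{q}_i)(\tfrac{1}{2} - \hat{q}_i) \right]}{(\hat{p}_i \hat{q}_i)^2} (q_i - \hat{q}_i) = \sum w_i''' (q_i - \hat{q}_i) = 0 \tag{10} \]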


* The primes on the w's are to distinguish between the different weights; they do not represent 
successive differential coefficients. 




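and similarly for $b$ 

\[ \sum \frac{n_i z_i x_i \left[ \hat{p}_i \hat{q}_i + (q_i - \hat{q}_i)(\tfrac{1}{2} - \hat{q}_i) \right]}{(\hat{p}_i \hat{q}_i)^2} (q_i - \hat{q}_i) = \sum w_i''' x_i (q_i - \hat{q}_i) = 0. \tag{11} \]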

The condition for the minimum $\chi^2$ (9) differs from the condition (7) 
of maximum likelihood only in the second term of (9). This term will 
vanish as (a) $\hat{q}_i \to .5$, (b) $q_i \to \hat{q}_i$, so that for experiments such as bio-assay 
where observations are in the neighborhood of the L.D. 50, or 
where the observations are close to the curve, the two solutions may be 
expected not to differ very much.† 

It is seen that all the normal equations are of the form 

\[ \sum w (q - \hat{q}) = 0 \]

or 

\[ \sum w q = \sum w \hat{q}. \]

It is interesting to note therefore that both minimum $\chi^2$ and maximum 
likelihood, for normal or binomial variation, impose the simple 
requirement that the weighted average of the estimates be equal to the 
weighted average of the observations, the only difference among them 
being the value of the weights $w$. 

For the situations usually considered in bio-assay work, the normal 
equations cannot be solved directly because the functions usually 
employed are not linear in the parameters, and where $q$ is assumed to 
vary binomially there is the additional reason that the weights are a 
function of the $\hat{q}$'s, the values to be estimated. The equations may be 
solved by a procedure of successive approximations by making the 
following substitutions. Let $\hat{q}_{i,1}$ be a preliminary estimate of $\hat{q}_i$, $\hat{y}_{i,1}$ the 
corresponding value of $\hat{y}_i$ and $z_{i,1}$ the corresponding value of $z_i$, and for 
brevity write $\hat{q}_1$ for $\hat{q}_{i,1}$, $\hat{y}_1$ for $\hat{y}_{i,1}$ and $z_1$ for $z_{i,1}$. 

\[ q_i - \hat{q}_i \equiv (q_i - \hat{q}_1) + (\hat{q}_1 - \hat{q}_i). \]

$(\hat{q}_1 - \hat{q}_i)$ is replaced by the first term in a Taylor's expansion 

† It is sometimes stated that it has been proved that the minimum $\chi^2$ and maximum likelihood 
solutions converge to the same solution as $n \to \infty$. From the foregoing it seems to me that this is not 
necessarily so. It will be so if $q_i \to \hat{q}_i$ as $n \to \infty$, but this is not always the case. When the function we 
are fitting is the "true" curve, theoretically $q_i \to \hat{q}_i$ with probability approaching 1 as $n \to \infty$, but not 
if we are fitting a function that is different from the "true" function, as is usually the case in statistical 
curve fitting. The same is true if we are fitting a line by least squares, minimising the squared residual of 
y, to observations of a random sample from a bivariate normal distribution, when there is an error in 
the observation of x. 




TABLE 1 

[The body of Table 1 is not legible in this copy.] 

* The sign of the "logit" as given here is the negative of that as defined previously [1]. 




\[ (\hat{q}_1 - \hat{q}_i) = z_1 (\hat{y}_1 - \hat{y}_i) \]

\[ (q_i - \hat{q}_i) \doteq (q_i - \hat{q}_1) + z_1 (\hat{y}_1 - \hat{y}_i) = z_1 (\hat{y}_i' - \hat{y}_i). \tag{12} \]

The value of $\hat{y}_i' = (q_i - \hat{q}_1)/z_1 + \hat{y}_1$ is called the "working value." 
If we replace $(q_i - \hat{q}_i)$ as in (12) in any of the normal equations previously 
given, and, where in any of those equations $\hat{q}_i$ appears, substitute 
the preliminary approximation $\hat{q}_1$, since $\hat{y}$ is linear in $a$ and $b$, the 
equations may be solved by the usual procedures as used for fitting a 

TABLE 2 

FIT OF LOGISTIC FUNCTION* 

                    Minimum χ²                          Maximum likelihood 
Iteration      a        b        χ²      L+30†        a        b        χ²      L+30† 

    0       -2.972    2.515    3.9480   0.7988     -2.972    2.515    3.9480   0.7988 
    1       -3.156    2.473    2.8404   1.0278     -3.290    2.581    2.9943   1.0644 
    2       -3.140    2.467    2.8390   1.0242     -3.324    2.602    3.0658   1.0657 
    3       -3.143    2.469    2.8389   1.0255     -3.325    2.602    3.0666   1.0667 
    4       -3.143    2.469    2.8389   1.0252     -3.325    2.602    3.0666   1.0657 

Approximate minimum χ², noniterative method 

            -3.126    2.444    2.8501   1.0074 

* Example from Finney [3]. To simplify the calculations x, the log dose, as given by Finney was 
adjusted to make x = zero for the lowest dose, and since the doses proceeded in multiples of 2, it was 
divided by log 2. The iterations were continued until the absolute difference between successive values 
of both a and b was less than 0.0005. 

† $L = \sum r_i \log \hat{p}_i + \sum (n_i - r_i) \log \hat{q}_i$, where $n_i$ is the number exposed at dose $x_i$, $p_i$ is the fraction 
affected at $x_i$, $r_i = p_i n_i$ and $\hat{p}_i$ is the solution value of $p_i$ ($\hat{q}_i = 1 - \hat{p}_i$). 

‡ Solution using n alone as weight. 


straight line by least squares, using as weight $w_i = w_i' z_1$, $w_i = w_i'' z_1$ or 
$w_i = w_i''' z_1$ as the case may be. With a solution obtained in this manner, 
the procedure outlined is repeated, using for the evaluation of the new 
weights and working values the values of $\hat{q}$ obtained in the present 
solution, and by repeating this, the respective minimum $\chi^2$ or maximum 
likelihood solution is approached as a limit. 
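The successive approximation just described can be sketched in a few lines of Python. The sketch is illustrative only and rests on several assumptions that are not taken from the paper: the logistic function is written as q = 1/(1 + exp(-y)), so that z = pq (the sign convention of the paper's own table may differ); the dose-response figures `X`, `N`, `Q` are invented for the example; and the minimum χ² weight is written as n z [p q + (q_obs - q)(1/2 - q)] / (p q)², which is obtained by collecting the two terms of the minimum χ² condition over a common denominator. The function names are likewise hypothetical.

```python
import math

# Invented dose-response data (assumption, for illustration only):
# x = coded log dose, n = number exposed, q = observed fraction.
X = [0.0, 1.0, 2.0, 3.0, 4.0]
N = [50, 50, 50, 50, 50]
Q = [0.10, 0.24, 0.50, 0.76, 0.90]


def logistic(y):
    """Assumed form of the fitted function: q = 1 / (1 + exp(-y))."""
    return 1.0 / (1.0 + math.exp(-y))


def weighted_line(x, y, w):
    """Weighted least-squares straight line y = a + b*x; returns (a, b)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return ybar - b * xbar, b


def fit_logistic(x, n, q, criterion, a=0.0, b=0.0, tol=1e-10, max_iter=200):
    """Iterate working values and weights until a and b settle.

    criterion is "ml" (maximum likelihood) or "min_chi2" (minimum chi-square).
    """
    for _ in range(max_iter):
        working, weights = [], []
        for xi, ni, qi in zip(x, n, q):
            y1 = a + b * xi              # preliminary transform
            q1 = logistic(y1)            # preliminary estimate of q
            p1 = 1.0 - q1
            z1 = p1 * q1                 # dq/dy for the logistic
            working.append(y1 + (qi - q1) / z1)   # working value
            if criterion == "ml":
                w = ni * z1 / (p1 * q1)
            else:
                w = ni * z1 * (p1 * q1 + (qi - q1) * (0.5 - q1)) / (p1 * q1) ** 2
            weights.append(w * z1)       # weight for the straight-line fit
        a_new, b_new = weighted_line(x, working, weights)
        converged = abs(a_new - a) < tol and abs(b_new - b) < tol
        a, b = a_new, b_new
        if converged:
            break
    return a, b


def chi_square(x, n, q, a, b):
    """Chi-square of the fitted curve against the observed fractions."""
    total = 0.0
    for xi, ni, qi in zip(x, n, q):
        qh = logistic(a + b * xi)
        total += ni * (qi - qh) ** 2 / (qh * (1.0 - qh))
    return total


if __name__ == "__main__":
    for name in ("min_chi2", "ml"):
        a_hat, b_hat = fit_logistic(X, N, Q, name)
        print(name, a_hat, b_hat, chi_square(X, N, Q, a_hat, b_hat))
```

Each pass is an ordinary weighted straight-line fit, so the only difference between the two criteria is the line that computes `w`. On data that lie close to the curve the two solutions should differ only slightly, which is the behavior the text anticipates.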

In Table 1 are given for some functions commonly used in statistical 
practice the working values and weights in the case of normal and binomial 
variation, for the minimum $\chi^2$ and maximum likelihood solutions. 
In Table 2 are given the results of fitting a logistic function to 
some data used by Finney [3] by the methods outlined for (1) minimum 
$\chi^2$; (2) maximum likelihood; (3) noniterative approximate minimum 
$\chi^2$ solution previously suggested by me [1]. 

REFERENCES 

[1] Berkson, J., "Application of the Logistic Function to Bio-Assay" (1944). 
Journal of the American Statistical Association, 39, 357. 

[2] Bliss, C. I., with an Appendix by Fisher, R. A., "The Calculation of the Dosage 
Mortality Curve" (1935). Annals of Applied Biology, 22, 134. 

[3] Finney, D. J., "Probit Analysis: A Statistical Treatment of the Sigmoid Response 
Curve." Cambridge University Press, 1947, p. 201. 

[4] Finney, D. J., "The Principles of Biological Assay" (1947). Journal of the 
Royal Statistical Society, (Suppl.) 9, 46. 

[5] Garwood, F., "The Application of Maximum Likelihood to Dosage Mortality 
Curves" (1940). Biometrika, 32, 46. 

[6] Lipka, Joseph, "Graphical and Mechanical Computation." Ed. 1, New 
York, John Wiley & Sons, Inc., 1918, 264 pp. 



SOME INADEQUACIES OF THE FEDERAL CENSUSES 
OF AGRICULTURE* 


Raymond J. Jessen 

Bureau of Agricultural Economics, and Statistical Laboratory, 
Iowa State College 


INTRODUCTION 


A NUMBER OF criticisms of the federal censuses of agriculture have 
been made by users of census data. The primary purpose of this 
paper is to bring together for discussion some of these criticisms, and to 
propose modifications designed to make the censuses more adequate 
and useful (i) as sources of general statistics on agriculture and (ii) 
as sources of information for economic analyses. A secondary purpose 
is to consider some questions of survey methods for obtaining agricultural 
information, whether the survey be a census or a sample. The 
results of a recent statistical survey of the United States¹ are used as 
an aid in determining the general order of magnitude of some of the 
points under consideration. 

The definition of a farm used by the Bureau of the Census appears to 
be inadequate for the purposes of agricultural economists on two 
counts: the exclusion of certain agricultural producing units because of 
(i) the scale of operations and (ii) the type of enterprise. In this connection 
it is estimated from the survey that the 1945 Census of Agriculture 
excluded, by the definition of a farm adopted, a group of about 5,200,000 
units whose scale of agricultural operations was too small to qualify 
them as "census farms" but which contributed an estimated $446,000,000, 
or about 1.8%, to total agricultural production in 1946. This 
in itself may not be important, but where kinds of agricultural production 
are under examination it may be of practical significance. 

In addition to the omissions from the census of units that do not meet 
the definition of a farm, there is some evidence that the census may 
have failed to enumerate some farms that do meet the definition. This 
is true particularly for comparatively small farming operations. The 
apparent deficiencies in the coverage of farms by the census may have 
arisen because of an incompleteness of coverage by the census investigator, 
or because of differences in the interpretation of the value of 
products of certain operations by the census investigator and by the 
investigators in the refrigeration survey. 

* Journal paper No. J-1635 of the Iowa Agricultural Experiment Station, Ames, Iowa, Project 1005, 
in cooperation with the Bureau of Agricultural Economics, U. S. Department of Agriculture. The views 
and conclusions expressed in this paper are entirely those of the author and do not necessarily in any 
way reflect the views of the Bureau of Agricultural Economics. 

¹ A national survey of the household refrigeration market conducted by the Statistical Laboratory 
in January, 1947. This survey was not designed for this purpose but it does have some information 
which may be of interest here. 

In connection with the problem of choice of area smaller than the 
county for presentation of data, it is suggested that a "natural" area 
such as a valley or some such drainage area be adopted by the Bureau 
of the Census, rather than the present unit which is mainly political 
and indeed subject to change from time to time. This would provide 
statistics on a more useful and permanent basis almost throughout the 
U.S., and particularly in the west where political divisions are of es¬ 
pecially small agricultural utility. 

General discussion of the problem. The data from federal government 
censuses are used by a great many persons and for a great many 
purposes. For the present discussion criticism will be considered mainly 
from the standpoint of its use for research in agricultural economics. 
Workers in this field frequently make use of the total farm count, area 
under various crops, inventories of a number of farm quantities, char¬ 
acteristics of the farm dwelling, etc. Agricultural statisticians, such as 
those of Agricultural Estimates, Bureau of Agricultural Economics, 
use the data as a "bench mark" on which to base a large body of current 
estimates on acreages under different crops, crop yields, livestock 
numbers and production, etc. 

Three classes of users of census data in agricultural economics will be 
distinguished: (1) those who are interested in the number and charac¬ 
teristics of “farms” as firms, (2) those who are interested in the number 
and characteristics of "farmers" and their families and (3) those who 
are interested in statistical aggregates: the total production of rice, 
for example. (There are of course other classes such as those interested 
in land owners, for example, but only these three classes will be dealt 
with here.) The first group requires data to be given in the form of 
frequency tables in which the farm is the unit (examples: number of 
farms by size in acres, and by value of output). The second group re¬ 
quires the data also in frequency tables but with the farmer (operator) 
as the unit (example: income of farm operators by age group). The 
third group is generally not concerned with the unit by which an item 
is collected or tabulated but merely the unit of measurement (bushel, 
head, etc.). It is possible, however, that there is a demand on the part 
of this group for such frequency distributions as acreage of harvested 
corn by size of field, yield of apples by age of tree, etc. 

The conflict between farm and farmer as a unit of enumeration and 
tabulation (class 1 versus class 2) is of real importance because many 
farms are "operated" by more than one farmer, for example, partnerships. 

In attempting to satisfy the demands of all classes of users, it was of 
course necessary for the Bureau of the Census to make some compromises 
which, in part at least, gave rise to the inadequacies to be 
discussed here. Consideration will be given to those inadequacies aris¬ 
ing from: 

(1) definition of farm
(2) enumeration
(3) definition of farmer
(4) definition of agriculture
(5) unit for presentation of data for small areas.

Inadequate definition of farm. In order to discuss properly the
implications of the “farm” as regarded by the Bureau of the Census, it may
be appropriate to quote in full the definition given the enumerators in
the 1945 census of agriculture. This definition follows.

A farm, for Census purposes, is all the land on which some agricultural
operations are performed by one person, either by his own labor alone or
with the assistance of members of his household, or hired employees. The
land operated by a partnership is likewise considered a farm. A “farm” may
consist of a single tract of land, or a number of separate tracts, and the
several tracts may be held under different tenures, as when one tract is
owned by the farmer and another tract is rented by him. When a landowner
has one or more tenants, renters, croppers, or managers, the land
operated by each is considered a farm. Thus, on a plantation the land
operated by each cropper, renter, or tenant should be reported as a
separate farm, and the land operated by the owner or manager by means of
wage hands should likewise be reported as a separate farm.

Include dry-lot or barn dairies, nurseries, greenhouses, hatcheries, fur 
farms, mushroom cellars, apiaries, cranberry bogs, etc. 

Do not include “fish farms”, “fish hatcheries”, “oyster farms”, and “frog
farms”. Do not report as a farm any tract of land less than 3 acres, unless its
agricultural products in 1944 were valued at $250 or more.

Farming, or agricultural operations, consists of the production of crops or
plants, vines, and trees (excluding forestry operations) or of the keeping,
grazing, or feeding of livestock for animal products (including serums),
animal increase, or value increase. Livestock, as here used, includes
poultry of all kinds, rabbits, bees, and fur-bearing animals in captivity,
in addition to mules, asses, burros, horses, cattle, sheep, goats, and
hogs. Frequently certain operations are not generally recognized as
farming. This is especially true where no crops are grown or where the
establishments are not commonly considered as farms.





AMERICAN STATISTICAL ASSOCIATION JOURNAL, JUNE 1949


Following is a partial list of types of specialized agriculture and of
operations not generally recognized as farms or farming, for which returns
on the Farm and Ranch Schedule are required, provided the area is 3 acres
or more or, if less than 3 acres, the value of the products in 1944 was
$250 or more: Apiaries (bee farms); Community or cooperative gardens;
Country estates and country homes (if there is production of vegetables,
eggs, milk, or other agricultural products either for home use or for
sale); Cranberry bogs; Dry-lot or barn dairies; Feed lots; Fur farms (fox,
mink, skunk, etc., in captivity); Garbage-feeding hog yards; Greenhouses;
Hatcheries (baby chicks, poults, etc.); Institutional farms (connected
with schools, prisons, hospitals, etc.); Mushroom cellars; Nurseries
(except for reforestation projects, or in connection with parks);
Part-time farms (agricultural operations incidental to other occupation);
Grazing or pasturing of livestock; Harvesting of grass seed; Keeping of
chickens and the production of broilers (including battery-laying and
battery-broiler plants); Production of medicinal or drug plants and herbs;
Production of flowers and bulbs for sale; Production of vegetables under
glass; Production of vegetable and flower seeds, plants, bulbs, tubers,
etc.; Production in captivity of pheasants, quail, etc.; Production of
mint, spices, or other special crops; Raising of Shetland or other ponies;
Rabbit raising; Squab raising.

If any specialized or unusual types of agriculture such as those mentioned
above are reported, list type under Supplemental Information on page 12.

Although columns are not provided on the schedule for obtaining reports
for all the above-mentioned specialized operations in detail, be sure to report
on all items that are applicable, making use of inquiries for “other crops”.
Note that value of land and buildings and value of sale of products should
be reported in all cases.

Include in one report all such land which the operator uses for
agricultural purposes, as previously defined, also all outlying or separate
fields, meadows, pastures, woodland, and waste land. A farm may consist of
two or more separate tracts not necessarily adjacent. Do not include public
or open range neither owned nor leased by the operator. If the operator
cuts hay from land that he does not own and for which he pays no rent,
include such acreage under Wild Hay Cut and explain under supplemental
information. Large areas of land or other non-agricultural land held as a
separate business and not used for pasture or grazing should not be
included.

The following types of establishments and operations do not require
returns on the Farm and Ranch Schedules unless there are also agricultural
operations: Canneries; Cheese factories; Creameries; Deer parks; Fish,
frog, alligator, or snake “farms”; Fish hatcheries; Game preserves;
Kennels; Livestock dealers (except feed lots or other farming operations);
Ostrich “farms”; Oyster “farms”; Parks; Riding academies with no farming
operations; Shipping pens; Turpentine “farms” or turpentine “orchards”;
Distilleries, gins, dryers, mills, refineries, or packing plants;
Establishments of 3 acres or more, even though locally known as “farms”,
on which there are no agricultural operations; Idle or abandoned farms
which were not operated in 1944 and will not be operated in 1945; Cutting
or gathering of forest products with no farming operations; Landscaping,
or maintaining grounds, and growing of flowers, shrubs, and ornamentals,
etc., except where the land is maintained primarily for their production;
Production of maple sirup or sugar with no farming operations; Raising
canaries, guinea pigs, white rats, or white mice; Stock yards and auction
yards or barns; Trapping of wild animals.

It will be observed that the definition given above excludes certain
agricultural producing units because of (i) scale of operations (as
measured in acres and in dollars), and excludes others because of
(ii) type of enterprise. Later (Section 6) the effects of type of
enterprise will be dealt with when the question of the scope of
agriculture is being considered.
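
The scale criterion in the quoted definition can be written out as a short rule. The following Python sketch is illustrative only: the function name and the class labels are assumptions (the labels follow the classes later used in Table 4), and the type-of-enterprise exclusions listed in the definition are deliberately ignored.

```python
def census_farm_class(acres, product_value, min_acres=3.0, min_value=250.0):
    """Classify a producing unit by the scale rule of the 1945 definition.

    A tract of 3 acres or more counts as a farm; a smaller tract counts
    only if its products were valued at $250 or more. Type-of-enterprise
    exclusions (fish farms, kennels, etc.) are not modelled here.
    """
    by_acreage = acres >= min_acres
    by_value = product_value >= min_value
    if by_acreage and by_value:
        return "both acreage and value"
    if by_acreage:
        return "acreage alone"
    if by_value:
        return "value alone"
    return "subcensus"


# Hypothetical units, for illustration only.
for acres, value in [(80.0, 2000.0), (10.0, 100.0), (2.0, 300.0), (1.0, 50.0)]:
    print(acres, value, census_farm_class(acres, value))
```

Because the value test is stated in dollars, the same physical unit can move between “subcensus” and “value alone” when prices change, which is the point taken up below.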

An estimate of the amount of agricultural production excluded by
this definition because of insufficient scale is provided by a recent
national sample survey.* According to this survey the total agricultural
production (stated by producers)† was about $25 billion in 1946
(Table 1). Of this quantity, $445 million, or 1.8%, took place on
“subcensus” farms (those too small to qualify as census farms). In general,
units large enough to qualify as census farms and whose “operators”
reside in the open country produce crops and animals in about the
same dollar volume, but all smaller units, whether operated in the
cities, towns or villages, produce crops valued at about 5 times that of
animals (Table 2).
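
The two figures in this paragraph can be checked against the survey tables. The sketch below takes its inputs from Tables 1 and 2 (values in millions of dollars); the variable and function names are illustrative assumptions, not part of the survey.

```python
# Values in $ millions, taken from Tables 1 and 2 of this paper.
TOTAL_PRODUCTION = 24_957        # all units, U. S.
SUBCENSUS_PRODUCTION = 445       # units too small to qualify as census farms

SUBCENSUS_CROPS, SUBCENSUS_ANIMALS = 373, 72                 # subcensus farms, U. S.
OPEN_COUNTRY_CROPS, OPEN_COUNTRY_ANIMALS = 10_191, 9_515     # census farms, open country


def percent_share(part, whole):
    """Part as a percentage of whole."""
    return 100.0 * part / whole


def crops_to_animals(crops, animals):
    """Ratio of crop value to animal value."""
    return crops / animals


subcensus_share = percent_share(SUBCENSUS_PRODUCTION, TOTAL_PRODUCTION)
subcensus_ratio = crops_to_animals(SUBCENSUS_CROPS, SUBCENSUS_ANIMALS)
open_country_ratio = crops_to_animals(OPEN_COUNTRY_CROPS, OPEN_COUNTRY_ANIMALS)

print(f"subcensus share of production: {subcensus_share:.1f}%")
print(f"crops/animals, subcensus farms: {subcensus_ratio:.1f}")
print(f"crops/animals, open-country census farms: {open_country_ratio:.2f}")
```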

* The National Refrigeration Survey dealt with a cross-sectional sample of households, both “farm”
and “non-farm”. A special attempt was made to get households to report agricultural production,
however small, by avoiding questions referring to farm or ranch when the respondent said he had no farm.
In these cases the interviewer was instructed to tell his respondent that even though he did not have a
farm he might nevertheless be engaged in the production of some crops and animals on his place, and
proceeded with the interview asking for details on such activities. In the office these data were used to
classify those units which qualified as “census farms,” the smaller producers as “subcensus farms”. A
weakness in the survey was the inadequate means for detecting multi-operator farms and therefore their
inclusion in the sample.

† Includes production used at home. It is realized, of course, that this information, reported by the
respondent, is subject to error due to vagueness in concept, memory, etc.

The questions, put to all households whether they appeared to be engaged in agricultural operations
or not, were:

“Does head of household operate a farm?”

(If yes) “How much would you say the garden and field crops sold last year were worth at local
market price? (Include home consumed as well as sold.)

“How much would you say the animals sold or the products obtained from them and sold
last year (such as milk, eggs, furs, honey) were worth at local market price? (Include home
consumed as well as sold.)”

(If no) “Even though you don’t have a farm or regard yourself as a farmer, do you have a garden,
chickens or some other small enterprise for production of agricultural products?”

    (If no) no further agricultural questions.

    (If yes) “How much were garden vegetables, fruits, berries, etc. worth at the local market
    price? (Include home consumed as well as sold.)

    “How much would you say that animals or animal products were worth at local
    market price? (Include home consumed as well as sold.)”

(Also questions on acreage, etc.)






TABLE 1

ESTIMATED NUMBER AND VALUE OF AGRICULTURAL PRODUCTION IN 1946,
OF FARMS AND “SUBFARMS”, U. S. BY ZONE

(1) Zone (a)
(2) Estimated “census” and “subcensus” farms (that is, all units
    reporting some agricultural production) (b)
(3) Estimated “census” farms (those which qualify as farms according
    to census definition) and relative standard error (c)
(4) Estimated “subcensus” farms (those excluded from the census because
    scale too small) and relative standard error; (2) minus (3)

(1)                         (2)                     (3)                  (4)

Total
  Number             11,788,000         6,582,000 ±  5%      5,207,000 ± 12%
  Production    $24,957,000,000   $24,512,000,000 ± 10%   $445,000,000 ± 12%

Open Country
  Number              5,946,000         5,200,000 ±  5%        746,000 ± 18%
  Production    $19,784,000,000   $19,706,000,000 ± 11%    $77,000,000 ± 18%

Rural Place
  Number              2,857,000           890,000 ± 13%      1,966,000 ± 17%
  Production     $2,972,000,000    $2,787,000,000 ± 28%   $185,000,000 ± 17%

Urban Place
  Number              2,985,000           491,000 ± 39%      2,494,000 ± 22%
  Production     $2,202,000,000    $2,018,000,000 ± 46%   $184,000,000 ± 22%

(a) Zone refers to where the operator lives, not necessarily where farm is situated.

(b) Estimates from National Refrigeration Survey.

(c) Probability is about 2/3 that the difference between the sample estimate and the true value (the
value that would result from a 100% sample) will be within plus or minus one standard error.
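
The relative standard errors attached to the survey estimates translate into ranges as follows. This is a minimal sketch which assumes that the sampling distribution of an estimate is approximately normal; the function names are illustrative.

```python
import math


def one_standard_error_range(estimate, relative_se_percent):
    """Range of plus or minus one standard error around an estimate."""
    half_width = estimate * relative_se_percent / 100
    return estimate - half_width, estimate + half_width


def normal_coverage(k):
    """Probability that a normal variable lies within k standard errors of its mean."""
    return math.erf(k / math.sqrt(2))


# Survey estimate of the number of census farms, with its relative standard error.
low, high = one_standard_error_range(6_582_000, 5)
print(f"one standard error: {low:,.0f} to {high:,.0f}")
print(f"coverage at one standard error: {normal_coverage(1):.3f}")
print(f"coverage at two standard errors: {normal_coverage(2):.3f}")
```

Under the normal assumption the one-standard-error range has a coverage close to the “about 2/3” quoted with the table, and the two-standard-error range used later in the text has a coverage of roughly 95%.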

It is estimated that there are about 5,207,000 subcensus units
producing some agricultural products (this number is not very meaningful
because of the obvious indefiniteness of “no production”). Nearly half,
or 2,494,000, of the operators of these subcensus farms reside in urban
towns and cities; another 1,966,000 or 40% of the operators live in
rural towns and villages, and the remaining 746,000 or 14% are in the
open country.* The place of residence of the operator may not be the
same as the location of the unit. No detail was obtained in the survey
to indicate the kinds of agricultural activity in which the operators of
these units are engaged, but considering their location and scale of
operations it would be reasonable to conjecture that they produce
mainly poultry and eggs, milk, vegetables, fruits, nuts, rabbits and the
like. These are items that may be of importance in food and diet
statistics.

* The partitioning of the U. S. into three “zones” follows a practice made possible by and established
in connection with the Master Sample of Agriculture. By “urban” zone is meant the area covered by
cities and towns having a population, in the 1940 Census of Population, of 2500 or more, or otherwise
designated by the Bureau of the Census as “urban”. The “rural place” zone consists of all named places
having at least 100 inhabitants, and other areas with a population density of at least 100 persons per
square mile, which are not included in the urban group. These places, whether incorporated or not,
have had boundaries described around them as part of the Master Sample Project operations. The
“open country” zone includes the remaining area of the U. S. outside of the “urban” and “rural place”
zones.
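
The three-zone partition used in the survey can be stated as a rule. The sketch below is a simplification under stated assumptions: the function name and its arguments are hypothetical, and the Master Sample drew actual boundaries around places, which a rule on population and density can only approximate.

```python
def classify_zone(population_1940, is_named_place, persons_per_sq_mile,
                  designated_urban=False):
    """Assign a place to a zone by the thresholds given for the Master Sample.

    Urban: 2,500 or more inhabitants in 1940, or designated urban by the
    Bureau of the Census. Rural place: a named place of at least 100
    inhabitants, or an area of at least 100 persons per square mile, not
    already urban. Open country: everything else.
    """
    if designated_urban or population_1940 >= 2500:
        return "urban"
    if (is_named_place and population_1940 >= 100) or persons_per_sq_mile >= 100:
        return "rural place"
    return "open country"


# Hypothetical places, for illustration only.
print(classify_zone(12_000, True, 900.0))
print(classify_zone(350, True, 40.0))
print(classify_zone(60, False, 8.0))
```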

TABLE 2

VALUE OF CROP PRODUCTION AND LIVESTOCK PRODUCTION 1946, OF
“SUBCENSUS” AND “CENSUS” FARMS, U. S. BY ZONE*

(1) Zone
(2) Source
(3) Total census and subcensus farms
(4) Census farms and relative standard error (per cent)
(5) Subcensus farms and relative standard error (per cent)

(1)           (2)                         (3)                    (4)                  (5)

Total, U. S.  Total    No.        11,788,000        6,581,000 ±  5.4     5,207,000 ± 12.3
                       Value  24,957,000,000   24,512,000,000 ±  9.9   445,000,000 ± 12.1
              Crops    No.        11,225,000        6,217,000 ±  5.5     5,008,000 ± 11.8
                       Value  14,267,000,000   13,894,000,000 ± 14.5   373,000,000 ± 12.4
              Animals  No.         6,891,000        5,553,000 ±  6.1     1,338,000 ± 24.4
                       Value  10,690,000,000   10,618,000,000 ± 10.6    72,000,000 ± 24.3

Open Country  Total    No.         5,946,000        5,200,000 ±  5.3       746,000 ± 17.7
                       Value  19,783,000,000   19,706,000,000 ± 10.8    77,000,000 ± 18.1
              Crops    No.         5,587,000        4,913,000 ±  5.8       674,000 ± 19.2
                       Value  10,252,000,000   10,191,000,000 ± 16.1    61,000,000 ± 19.2
              Animals  No.         4,901,000        4,663,000 ±  6.2       238,000 ± 40.9
                       Value   9,531,000,000    9,515,000,000 ± 11.3    16,000,000 ± 36.4

Rural Place   Total    No.         2,857,000          890,000 ± 12.7     1,967,000 ± 16.6
                       Value   2,972,000,000    2,787,000,000 ± 27.6   185,000,000 ± 17.1
              Crops    No.         2,783,000          843,000 ± 13.1     1,940,000 ± 16.9
                       Value   2,110,000,000    1,955,000,000 ± 35.0   155,000,000 ± 18.4
              Animals  No.         1,219,000          620,000 ± 14.6       599,000 ± 29.7
                       Value     861,000,000      832,000,000 ± 32.5    29,000,000 ± 26.5

Urban Place   Total    No.         2,985,000          491,000 ± 39.2     2,494,000 ± 22.3
                       Value   2,202,000,000    2,019,000,000 ± 46.1   183,000,000 ± 22.4
              Crops    No.         2,855,000          461,000 ± 38.4     2,394,000 ± 22.0
                       Value   1,905,000,000    1,748,000,000 ± 54.5   158,000,000 ± 22.0
              Animals  No.           771,000          270,000 ± 53.9       501,000 ± 50.8
                       Value     298,000,000      271,000,000 ± 64.7    27,000,000 ± 52.4

* Estimated from the National Refrigeration Survey.

Not only does the census definition of farm exclude a segment
of agriculture on the basis of scale of operations, but it does so without
consistency through time. The total number of farms enumerated by
the census, other things being equal, depends on the agricultural price
level. During years of high agricultural prices a greater number of units
should qualify as farms than during years of low agricultural prices,
even though physical production remains the same. This is due to the
fact that one of the criteria chosen to measure scale is agricultural
production in terms of dollar value ($250 per year). This sensitivity of
the census farm count to the price situation could give rise to unfortunate
complexity in the interpretation of a time series of farm counts.
Speculating with data available from the survey (Table 3) it appears that if
the agricultural price level is doubled (that is roughly what took place
between the 1940 and 1945 censuses) about 1,500,000 farms should appear
in the 1945 Census which would not have qualified for the 1940 census,
assuming of course that the number and physical volume of production of
all units producing agricultural products remain the same. Also if
the price level should drop to one-half the 1945 level about 500,000
farms would be “lost”. These speculations assume of course that
enumerations each time are complete and that the definition of a farm
used in 1946 was strictly adhered to. These assumptions are not
completely true, as will be pointed out in section 4. This sensitivity could
be largely eliminated if the volume of production requirement in terms
of dollar value were put at a value level each time of census such that
a constant physical volume requirement would be maintained. Also if
farm counts were published separately by zones, such as the three used
here (open country, urban place and rural place), it might be possible
to confine most of the changes due to price level fluctuations to the
tables for towns and cities, the open country being somewhat freer
from such effects.

TABLE 3

ESTIMATED NUMBER AND VALUE OF AGRICULTURAL PRODUCTION 1947, OF
CENSUS AND SUBCENSUS FARMS, U. S. BY SIZE (IN ACRES) CLASS (a)
(Numbers in thousands, value in $ millions)

(1)                       (2)     (3)     (4)     (5)     (6)     (7)     (8)     (9)
Value class ($)         Total     0-3    3-29   30-99 100-179 180-259 260-499   500 &
                                                                                 over

1-99           No.      3,299   3,046     151      75      21                       6
               Value      137     127       7       3       b                       b
100-249        No.      2,654   2,160     342     108      38       3       3
               Value      401     318      57      18       7       b       1
250-399        No.        690     303     257     112      15       3
               Value      221      93      85      38       4       1
400-499        No.        437     164     117     111      33       3       9
               Value      182      67      48      46      15       2       4
500-999        No.      1,014     194     330     308     137      39       6
               Value      693     127     223     211     104      24       4
1,000-1,999    No.        912      15     237     414     145      54      33      14
               Value    1,253      20     309     565     213      81      45      20
2,000-4,999    No.      1,354      14      86     388     464     190     143      69
               Value    4,248      33     254   1,153   1,457     586     511     254
5,000-9,999    No.        738       -      30      74     193     138     198     105
               Value    5,253       -     219     488   1,319   1,008   1,439     780
10,000-19,999  No.        531       -      32      50      78      71     127     173
               Value    6,968       -     383     660   1,048     960   1,736   2,181
20,000 & over  No.        159       -      33      18      25      11      27      45
               Value    5,601       -   1,293   1,020     589     355     746   1,598

Total          No.     11,788   5,896   1,615   1,658   1,149     512     546     412
               Value   24,957     785   2,878   4,202   4,756   3,017   4,486   4,833

(a) Data from National Refrigeration Survey based on a sample, therefore subject to sampling
variation. See Tables 1 and 5 for indications of the size of these sampling errors.

(b) Less than $500,000.
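
The speculation about price sensitivity can be partly retraced from the first size column of Table 3 (units under 3 acres), since only those units depend on the $250 test. The sketch below is one reading of the argument, not necessarily the computation actually made: the names are illustrative, and the price index passed to `indexed_threshold` is hypothetical.

```python
# Units under 3 acres, in thousands, by value-of-products class in dollars
# at survey prices (Table 3, first size column).
UNDER_3_ACRES = {
    (1, 99): 3046,
    (100, 249): 2160,
    (250, 399): 303,
    (400, 499): 164,
    (500, 999): 194,
    (1000, 1999): 15,
    (2000, 4999): 14,
}
THRESHOLD = 250  # dollar value required of a tract under 3 acres


def units_lost_if_prices_halve(table, threshold=THRESHOLD):
    """Units now worth $250 to $499 would fall below $250 at half the price level."""
    return sum(n for (low, high), n in table.items()
               if low >= threshold and high < 2 * threshold)


def max_units_gained_if_prices_double(table):
    """Upper bound only: units worth $125 to $249 would qualify, but the
    table gives just the wider $100-249 class."""
    return table[(100, 249)]


def indexed_threshold(base_threshold, price_index, base_index=100.0):
    """Dollar requirement that keeps the physical-volume requirement constant."""
    return base_threshold * price_index / base_index


print(units_lost_if_prices_halve(UNDER_3_ACRES))         # thousands of units
print(max_units_gained_if_prices_double(UNDER_3_ACRES))  # thousands of units
print(indexed_threshold(250, 200))                       # dollars
```

The first figure, 467 thousand, agrees with the “about 500,000” of the text. The second is only an upper bound; the 1,500,000 of the text would correspond to roughly seven-tenths of the $100-249 class lying at $125 or above.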

Inadequate enumeration. Another part of agriculture that has been
omitted by the Bureau of the Census, but included in its definition, is
that which has been missed for one reason or another by the field
worker, deleted in editing, or missed for other reasons. The differences
shown in Table 4 between the survey estimates for 1947 and the data
of the 1945 Census of Agriculture are within the range given by twice
the sampling error of the survey estimates, for both the total number
of farms and value of agricultural production (when adjustment is
made for the difference in price level between 1944 and 1946). However,
if the number of farms is considered by classes according to the criterion
(acreage, value or both) of the Census definition satisfied, it appears
that about 590,000 of the total difference of 723,000 between the
National Refrigeration Survey and the 1945 Census of Agriculture
occurs in the class of farms qualifying on value of products alone
(Table 4). The difference for this class is statistically significant.

TABLE 4

NUMBER OF FARMS BY CRITERION OF CENSUS DEFINITION SATISFIED,
NATIONAL REFRIGERATION SURVEY (1947) AND 1945
CENSUS OF AGRICULTURE

Class                          National Refrigeration Survey,   Federal Census of
                               Jan. 1, 1947 (estimate and       Agriculture,
                               standard error)                  April 1, 1945

Total “Census” Farms, U. S.         6,582,000 ±  5%                5,859,000
Qualifying on:
  Value alone                         690,000 ± 20%                   99,000
  Acreage alone                       747,000 ± 20%                  552,000
  Both acreage and value            5,145,000 ±  6%                5,208,000
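
The class-by-class comparison can be expressed as the difference between survey estimate and census count, measured in survey standard errors. The sketch below uses the figures of Table 4 and makes two simplifying assumptions that should be kept in mind: the census count is treated as free of sampling error, and no allowance is made for the two years and the change in price level separating the two inquiries.

```python
# Survey estimate and relative standard error (per cent), by class (Table 4).
SURVEY = {
    "value alone": (690_000, 20),
    "acreage alone": (747_000, 20),
    "both acreage and value": (5_145_000, 6),
}
# 1945 Census of Agriculture counts for the same classes (Table 4).
CENSUS = {
    "value alone": 99_000,
    "acreage alone": 552_000,
    "both acreage and value": 5_208_000,
}


def difference_in_standard_errors(estimate, relative_se_percent, census_count):
    """Survey estimate minus census count, in units of the survey standard error."""
    standard_error = estimate * relative_se_percent / 100
    return (estimate - census_count) / standard_error


ratios = {
    label: difference_in_standard_errors(est, rel_se, CENSUS[label])
    for label, (est, rel_se) in SURVEY.items()
}
for label, ratio in ratios.items():
    print(f"{label}: {ratio:+.1f} standard errors")
```

Only the “value alone” class differs by more than two standard errors, which is the sense in which its difference is called significant above.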

There are a number of possible explanations of this difference. A
reasonable interpretation is that farms of this type are often missed in the
census. It appears that these farms are mostly small in scale and,
consequently, might not have looked like farms to the Census investigator.
As a rule they were situated in areas generally regarded as “non-farm”
or had no resident operator and therefore might more easily have been
passed over (Table 5). On the other hand, it might be that they were
recognized by, or known to, the enumerator, but in his estimation were
so costly to seek out and interview that he decided to omit them.

It is reasonable to believe that at least part of the discrepancy
between the NR Survey and the census may be due to differences in the
elicitation of value of products.† That is, a census investigator might
well have visited a unit and found that the value of products (as
obtained by him) was not sufficient to qualify the unit as a census farm,
while for the same unit a survey investigator might have obtained a
reported value of products that would satisfy the census criterion. In
the absence of information as to how well value of products was elicited
in the census and in the survey, and partly also because of the changes
in price level, it is not possible to say which of the two figures for
number of farms is more nearly correct. That is immaterial, however, for
this analysis. The striking feature is that it is possible for two sets of
presumably qualified investigators to take the same definition out in
the field, and bring back significantly different results.

† It should be noted that the procedure and questions for arriving at the total value of farm
products were different in the National Refrigeration Survey than in the 1945 Census of Agriculture.

TABLE 5

ESTIMATED NUMBER OF FARMS BY ZONE: 1940 CENSUS OF AGRICULTURE
AND RESULTS OF NATIONAL REFRIGERATION SURVEY*

(1)                    (2)                          (3)
Zone                   1940 Census of               Survey estimate (b) and its
                       Agriculture (a)              standard error (c)

Total                    6,096,799                    6,582,000 ±  5.4%
Open country             5,532,374                    5,200,000 ±  5.5%
Rural places               491,062                      891,000 ± 13.6%
Urban places                73,363                      491,000 ± 38.3%

* The two sets of estimates (1940 and 1947) are not directly comparable because of the fact that
the estimates for the 1940 Census of Agriculture are based on location of the farm, while those from
the National Refrigeration Survey are based on residence of the farm operator. These latter estimates
(1947) tend to indicate more farms in the “urban places” and “rural places” zones. Thus, in the 1945
Census of Agriculture 340,000 farm operators reported themselves as not living on the farm they
operated. Although place of residence was not tabulated for these operators, most of them undoubtedly
resided in urban or rural places, rather than elsewhere in the open country zone.

(a) These estimates were made in connection with the Master Sample Project before the 1945 data
were available. None have been made on the more recent data. Since they are estimates they are subject
to error. Zone refers to location of farms and not residence of farm operator as in the Survey Estimates.

(b) Estimated from the National Refrigeration Survey sample. The sampling error of the estimate is
given alongside each figure. Zone refers to where the operator lives, not necessarily to where farm is
situated, as in the estimates of 1940.

(c) Probability is about 2/3 that the difference between the sample estimate and the true value
(the value that would result from a 100% sample) will be within plus or minus one standard error.

“Correction of this inadequacy will require more than just bringing
pressure to bear on the workers for better work. Assuming a change in
definition, changes in survey procedure may still be recommended.”
First, out of consideration of economy, it may be feasible to confine the
canvassing of cities and towns to a sample but retain the complete
coverage of the open country zone (Table 5). Second, require the
investigator to call on all households in his district (except those in
towns and cities which are not in his sample) and make out a report for
each, whether the household operates a “farm” or not. The identification of
farms would be made in the central office where a consistent policy
could be established and kept under control. This procedure is
proposed to eliminate or reduce unknown incompleteness in return for
one which has a calculable amount of sampling fluctuation. The
complete census of all households in the open country, whether farm or
non-farm, would probably have considerable use to market researchers,
students of population problems, etc., who now must depend on data
obtained on a somewhat unsteady and not too well-known base.
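
How a complete open-country canvass and a sample of town and city households would combine into a national farm count, with a “calculable amount of sampling fluctuation”, may be sketched as follows. Everything here is an assumption made for illustration: the function names, the simple random sample of households at a uniform fraction, the binomial approximation to the standard error for a rare attribute, and the input figures.

```python
import math


def estimate_farm_total(open_country_farms, town_sample_farms, sampling_fraction):
    """Complete open-country count plus the expanded town-and-city sample count."""
    if not 0 < sampling_fraction <= 1:
        raise ValueError("sampling_fraction must lie in (0, 1]")
    return open_country_farms + town_sample_farms / sampling_fraction


def approx_standard_error(town_sample_farms, sampling_fraction):
    """Rough standard error of the expanded count.

    Assumes a simple random sample of households in which farm households
    are rare, so the variance of the sample count is about (1 - f) times
    the count itself. The open-country part contributes no sampling error.
    """
    return math.sqrt((1 - sampling_fraction) * town_sample_farms) / sampling_fraction


# Hypothetical inputs: a 5 per cent sample of town and city households.
total = estimate_farm_total(5_200_000, 69_000, 0.05)
standard_error = approx_standard_error(69_000, 0.05)
print(f"estimated farms: {total:,.0f} (standard error about {standard_error:,.0f})")
```

A clustered sample, such as would be used in practice, would have a larger standard error than this simple approximation suggests.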

Inadequate definition of operator (farmer). In general the word
operator (when dealing with farms) is used to describe a person who
“operates” a farm. If two persons jointly operate a farm both would
generally be regarded as operators and each partner would so regard
himself. Complete independence in the operation of a farm is therefore
not required before the status of “operator” can be reached. It may be
helpful to define some concepts on this matter as follows: Let the
entrepreneurial function required of each farm be performed by a
person or a number of persons which collectively will be called the
operatorship. Each farm must have one and only one operatorship, but
that operatorship may consist of one or more persons called operators.
Now an operator may be associated with more than one farm through
membership in more than one operatorship, so the total number of
operators for a given area could be a different figure than the total
number of farms for that area.
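
The farm, operatorship and operator concepts just defined lend themselves to a simple data structure. The following is a minimal sketch with hypothetical names and data; it is meant only to show why the count of operators and the count of farms need not agree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Farm:
    """A farm with its single operatorship, given as a set of operators."""
    name: str
    operatorship: frozenset

    def __post_init__(self):
        if not self.operatorship:
            raise ValueError("each farm needs an operatorship of at least one operator")


def count_farms(farms):
    """Number of farms, which is also the number of operatorships."""
    return len(farms)


def count_operators(farms):
    """Number of distinct persons belonging to at least one operatorship."""
    return len(set().union(*(farm.operatorship for farm in farms)))


# Hypothetical area: Baker is a partner on Farm A and sole operator of Farm B.
farms = [
    Farm("Farm A", frozenset({"Adams", "Baker"})),
    Farm("Farm B", frozenset({"Baker"})),
    Farm("Farm C", frozenset({"Clark", "Davis", "Evans"})),
]
print(count_farms(farms), count_operators(farms))
```

A rule that records only one operator per farm would report three operators for this area, against five persons who would regard themselves as operators.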

A distinction of this sort is not made by the Bureau of the Census. To
quote its instructions, “For a farm operated by two or more partners
enter only one of the partners as the operator, preferably the senior
partner, unless the junior partner is actually conducting the operations.”
(Italics mine.) This definition of operator, for the sake of simplicity,
does violence to reality.




Forcing multi-person operatorships into this mold brings about
some curious results. There will be an undercount of the number of
true operators, that is, the kind which the economists, sociologists
and the laymen would be thinking of. Likewise a number of other
undercounts result, such as number of farm dwellings having electricity,
etc. (only one dwelling from each farm is eligible for this information),
mortgage debt (asked of the one operator only and of only those who
own land), land owned by the operator personally (here again asked
only of the one operator), etc. This difficulty could be overcome by
adopting a more complex model than now used, one permitting of
multi-person operatorships.

Inadequate definition of agriculture. Although the Bureau of the
Census calls its quinquennial survey a “census of agriculture” there is
some doubt that it is comprehensive. This “census of agriculture,” for
example, “covers” about 60% of the land area of the United States
(Table 6). There may be some objection to the regarding of the
remaining 40% as containing no “agriculture”. For example, land under forest
is regarded by the Bureau of the Census as a farm enterprise (that is,
one on which information is obtained) if it is on a “farm”; if such land
is not on a farm it is not reported in the “agricultural census.” Land
under grass is similarly treated; for example, “land used under a
grazing permit is not to be included” as part of the areal extent of a farm.
Even land that contributes directly to the production of food, such as
“production of maple sirup or sugar with no agricultural operations”
and “picking or gathering of wild nuts, wild fruits, or wild plants
(medicinal, ornamental, etc.) except where the land is maintained
primarily for their production”, is not reported in the agricultural census
“unless there are agricultural operations”. (Italics mine.)

TABLE 6

TOTAL AREA IN LAND AND IN FARMS, U. S., BY AGRICULTURAL
CENSUS YEARS 1920-1945*

(1)        (2)                 (3)               (4)          (5)
Year    Total Land Area     Total Area in Farms              Total Farms
           (Acres)             (Acres)        (Per cent)      (Number)

1945    1,905,361,920       1,141,613,510        59.9         5,859,169
1940    1,905,361,920       1,060,852,374        55.7         6,096,799
1935    1,903,216,640       1,054,515,111        55.4         6,812,350
1930    1,903,216,640         986,771,016        51.8         6,288,648
1925    1,903,216,640         924,319,352        48.6         6,371,640
1920    1,903,215,360         955,888,715        50.2         6,448,343

* Data from published reports of the Bureau of the Census.
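
The percentage column of Table 6, and the “remaining 40%”, follow directly from the two acreage columns. The sketch below recomputes them; the dictionary and function names are illustrative.

```python
# Year: (total land area, total area in farms), in acres, from Table 6.
TABLE_6 = {
    1945: (1_905_361_920, 1_141_613_510),
    1940: (1_905_361_920, 1_060_852_374),
    1935: (1_903_216_640, 1_054_515_111),
    1930: (1_903_216_640, 986_771_016),
    1925: (1_903_216_640, 924_319_352),
    1920: (1_903_215_360, 955_888_715),
}


def percent_in_farms(total_land_acres, farm_acres):
    """Area in farms as a percentage of total land area, to one decimal."""
    return round(100 * farm_acres / total_land_acres, 1)


percentages = {year: percent_in_farms(*areas) for year, areas in TABLE_6.items()}
for year, pct in percentages.items():
    print(year, pct, "per cent in farms;", round(100 - pct, 1), "per cent outside")
```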

“Farming, or agricultural operations”, according to the Bureau,
“consists of the production of crops or plants, vines and trees
(excluding forestry operations) or of the keeping, grazing, or feeding of
livestock for animal products (including serums), animal increase, or value
increase”. This seems like a reasonable definition of agriculture, except
for the exclusion of forestry operations, but by attempting to make
this concept synonymous with farming the Bureau has made some
rather curious decisions, some of which have just been mentioned. The
resulting census appears to be one of farming rather than of
agriculture, and should be so labelled. Or, as an alternative, the scope could be
broadened to include non-farming agriculture* as well. By this means
the statistical no-man’s land, comprising 40% of all United States
land, can be brought under adequate statistical description. By
bringing these two segments under one comprehensive periodic census it
will be possible to eliminate the unknown duplications, omissions and
other difficulties to which the present arrangement is subject when
the Forest Service, Bureau of Agricultural Economics and other
agencies attempt independently to get information on the neglected
40%.

* Include all animal and plant production, including fish and a number of other categories
presently omitted by the Bureau of the Census.

Inadequate unit for presentation of data for small areas. Although it
does not publish its agricultural census data on geographic areas smaller
than a county, the Bureau of the Census will provide data by Minor
Civil Division if and when someone is willing to pay for it. The smallest
unit on which the Bureau can provide data is the enumeration district,
which is usually the Minor Civil Division (township, beat, etc.) or some
subdivision thereof. The Minor Civil Division is of course a unit
determined primarily for political purposes. In many states its boundaries
are difficult to determine and are short-lived. They do not necessarily
describe areas having local significance such as a community.

There is a great need for agricultural data on a small area basis which
can be effectively used for land use experimentation, research and
planning. There is not only a need for data for areas smaller than a
county but also for areas larger than a county but smaller than a state,
e.g., the irrigated valleys of the west.

It is suggested that for agriculture a “natural” area be adopted
instead of the Minor Civil Division. This area might be based on that
which has a common drainage, such as a valley. These will be
permanent and will have local significance. They could be the bricks out
of which valley authority areas could easily be constructed. They
would not interfere with statistics for counties but would greatly
facilitate the compilation of useful valley data.

Summary of proposals. A number of inadequacies in the census of 
agriculture have been discussed, using the 1945 census as the main 
illustration. Some of these were traced to difficulties of concept such as 
the conflicts between that which is agricultural and that which is 
farming and the operator as a person versus the operatorship as an 
economic function. Improvement can be made here by proper labelling 
and by broadening the scope. 

To improve the completeness of enumeration it is suggested that the 
questionnaire be constructed suitable for use with all households 
whether they be associated with farms or not. In the open country all 
households should be canvassed but in cities and towns the canvassing 
should be confined to a sample (in the agricultural census taken above) 
in order to avoid heavy costs. 

It is suggested that in place of the Minor Civil Division a natural 
area based on drainage be adopted as the basic unit for providing in¬ 
formation on small areas. 



THE EDGE MARKING OF STATISTICAL CARDS 

A. M. Lester 
Montreal 

The system of edge-marking of statistical cards described by Dr. 
Thurstone in the September issue of the Journal has an additional 
application of considerable importance which he did not mention. This 
was discovered in the work of analyzing flying accident causes during 
the recent war. The main data for each accident could be expressed in 
from 100 to 200 ‘‘Yes-No” answers. The total number of accidents to 
be dealt with was sufficient to make a fully mechanical punch-card 
system a reasonable possibility, but it was found that reference to a 
brief written history of each accident was so frequently needed during 
analysis that hand-sorting of cards which could contain such histories 
was superior to full mechanization. Cards were utilized with edge 
punching and the needle was used for most sorting operations, particu¬ 
larly, of course, for replacing the cards in their filing arrangement. 

The usual objections to the punching and pinning method were 
found, however, i.e. errors of punching were difficult to repair satisfac¬ 
torily and virtually involved making out a new card each time; the 
edges of the cards eventually tended to tear after much use, and to pin 
through several thousand cards in search of an element occurring only 
a very few times was a laborious procedure which might be unreliable 
owing to cards sticking in the pack. 

A system of marking the rims of the cards instead of punching the 
edges was therefore tried, meaning the actual thin rims of the cards and 
not the one-eighth of an inch of the front nearest that rim. Fairly thin 
cards were being used which ran at about 100 to the inch, and it was 
found not only that a tiny mark on the rim of one of them made with 
the back of a pen using India ink or red ink was easily visible (when the 
cards were carefully blocked so that their rims presented an even sur¬ 
face) but also that quite small deviations in the position of the marks 
could be at once detected. The rims on some types of card were too 
spongy and caused the ink to run as blotting paper would, but with 
good quality cards and a fine pen a mark could be made quickly and 
accurately. 

This form of rim marking does not have the advantage of Dr. 
Thurstone’s method, that both sides of the card can be used. On the 
other hand it can claim the following points in its favor as compared 
with Dr. Thurstone’s scheme, or as compared with the ordinary edge 
punching scheme: 


AMERICAN STATISTICAL ASSOCIATION JOURNAL, JUNE 1949 

(1) ease of reference: count can be made while the cards are in their filing 
drawers if they are tapped into alignment to ensure accuracy; 

(2) ease of extraction: cards can be picked out without disarranging the 
other cards. For many types of analysis this is a particularly valuable 
asset; 

(3) ease of correcting errors: ink marks on the rims of cards can be bleached 
or whitened over just as they can on the front of a card but they also 
can be shaved off with a sharp pair of scissors without seriously impair¬ 
ing the trim of the cards; 

(4) marking of the rims of the cards can be used in conjunction with edge 
punching and with Dr. Thurstone’s type of edge-marking. 

This type of rim marking might be of value in many statistical analy¬ 
ses although, of course, it has its limitations. It is particularly applica¬ 
ble where, as in accident statistics, a large number of different elements 
may occur only a few times each in a large universe. In this type of 
statistical analysis one of the most frequent questions to be answered 
is how often a certain element occurred in several thousand instances, 
and what were the circumstances in each case. If the occurrence of the 
element in question has been marked on the rim of the cards, it is a 
matter of a few minutes to run through the drawers and pick out the 
cards in question for further study. For this type of analytical 
procedure, rim marking would often be quicker than mechanical sorting 
from punched cards, as well as being enormously cheaper. 
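
The query described above (how often an element occurred, and what the circumstances were in each case) can be pictured with a small modern sketch. It is only an analogy under stated assumptions: the `Card` record, the `rim_marks` set, the element names, and the histories are all hypothetical and are not taken from the accident files discussed in the note.

```python
from dataclasses import dataclass, field


@dataclass
class Card:
    """One card: a brief written history plus the elements marked on its rim."""
    accident_id: int
    history: str
    rim_marks: set = field(default_factory=set)


def pick_marked(drawer, element):
    """Return the cards whose rim carries the mark for `element`.

    The drawer itself is not reordered, mirroring the point that cards
    can be picked out without disarranging the others.
    """
    return [card for card in drawer if element in card.rim_marks]


# Hypothetical drawer of three cards; the element names are made up.
drawer = [
    Card(1, "hypothetical history A", {"element_17"}),
    Card(2, "hypothetical history B"),
    Card(3, "hypothetical history C", {"element_17", "element_42"}),
]

picked = pick_marked(drawer, "element_17")
print(len(picked), "cards carry the mark")
for card in picked:
    print(card.accident_id, card.history)
```

The count answers "how often", and the histories on the picked cards answer "under what circumstances", which is the pair of questions the note says arises most frequently.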



CONRAD ALEXANDER VERRIJN STUART (1865-1948) 

Another notable statist, honorary member for thirty-five years of 
the American Statistical Association and outstanding member for fifty 
years of the International Statistical Institute, has become a part of 
the history of his major field of work. His friend and compatriot, H. S. 
Methorst, who succeeded him as Secretary General of the Institute in 
1911 and held that post for a quarter of a century, will write for its 
Bulletin an obituary note based on a wealth of information personal and 
professional far greater than mine. 

It remains for me to outline those aspects of Verrijn Stuart's life 
which will interest especially the readers of this Journal and accounts 
of which have been printed in other languages than Dutch. I knew him 
as a fellow member of the Institute for half a century during which 
period he missed not one of its twenty biennial sessions (1899-1938); in 
those sessions he played a leading role. In 1899 he became director of 
the Central Statistical Bureau of the Netherlands the reorganization 
of which he described at that time in the Institute’s Bulletin and some 
twenty years later for our Association in Koren’s “History of 
Statistics.” In 1902 he printed in the Bulletin a paper on Birth Rates, Still 
Births and Infant Mortality in certain Dutch Cities and County Dis¬ 
tricts. Nine years later as Secretary General of the Institute and 
Chairman of the Committee which organized its Thirteenth Session he 
welcomed its members and guests to the Netherlands and The Hague. 

After the first part of the second Thirty Years War he presented to 
the Institute at its Rome session (1925) a note on the Representative 
(Sampling) Method and five years later at the Tokio session a Report 
on National Capital and Income in the Netherlands. In the following 
year he submitted to the Financial Committee of the League of 
Nations a Memorandum on the Gold Question. 

In the Revue of the International Statistical Institute which began 
in 1933 he published a paper (1935) on the Causes of Death and 
another (1938) on Cancer in the Netherlands. Of more permanent 
interest probably are his reports on the Mexico City (1933), London 
(1934) and Athens (1936) sessions of the Institute. The suggestion 
which he made and repeated in those reports looking towards an im¬ 
provement in the scientific character of the sessions grew out of an 
experience almost unparalleled in length and depth, during four years 
of which he had been Secretary General and then for twenty-seven 
years an independent friend and critic of his successors, a suggestion 
in which I heartily concurred. He thought that the work of the Institute 
would be improved by a drastic reduction in the number of topics con¬ 
sidered at a session, a reduction in the number and importance of 
section meetings, and a concentration on a few subjects to be probed 
in the general sessions. 

This suggestion was in line with changes made at the Washington 
session in 1947, changes far more radical than those to which he had 
looked forward ten years before. We may hope indeed that the for¬ 
ward steps in the progress of international statistics then taken will be 
found a generation hence to be comparable in importance with those 
taken between 1875 and 1890 when the Institute arose out of the ashes 
left by the Franco-Prussian War which had killed its predecessor, the 
International Statistical Congress. It was for some such an advance 
that Verrijn Stuart worked and hoped. 

Walter F. Willcox 



PROCEEDINGS 


108TH ANNUAL MEETING 
HOTEL STATLER, CLEVELAND, OHIO 

MINUTES OF THE ANNUAL BUSINESS MEETING 

The American Statistical Association convened for its 108th Annual Business 
Meeting on the evening of December 28, 1948, at the Hotel Statler in Cleveland, 
Ohio. A motion was made and passed to approve the minutes of the last Annual 
Business Meeting held in New York at the Hotel Commodore, December 29, 
1947. 

Aryness Joy Wickens, Chairman of the 1948 Committee on Fellows reported 
concerning the fellows elected by the Committee.* 

George W. Snedecor announced the appointment of W. Edwards Deming as 
the new member of the Committee on Fellows to serve for a period of five years 
from January 1949 through December 1953. Merrill M. Flood gave the reports 
of the Secretary and Treasurer.* 

Walter A. Shewhart, one of the retiring Directors of the Association read the 
report of the Board of Directors on activities for 1948.* This report had been 
approved by the Board at its meeting on Monday, December 27, 1948. 

Isador Lubin announced that the ASA and the American Economic Association 
would participate jointly in a memorial service for Wesley C. Mitchell to be held 
at the Hotel Cleveland, Wednesday, December 29, 1948. 

Joseph Berkson reported approval of the activities of the Biometrics Section 
during 1948. 

Isador Lubin read the minutes of the Commission on Statistical Standards and 
Organization and asked that the recommendations be presented to the member¬ 
ship for such action as they might wish to take. 

In the discussion following the motion to accept Mr. Lubin’s report, Theodore 
Brown stated that he thought the report opened the way for, or even by inter¬ 
pretation directed, the Statistical Association to investigate any situation in 
which it did not approve of the statistical results. Mr. Brown said that he had not 
the slightest objection to a Commission which would set up standards of good 
quality in statistical methodology or of ethics in their use, but that in general he 
believed the work of the Commission should be strictly limited to the setting of 
such standards. 

He indicated that the Commission might well act as an arbitrator in a situa¬ 
tion in which a dispute had arisen involving either statistical procedures or their 
interpretation, but the Association participation, if wise, would be limited to 
those cases in which the request for a review of a situation originated outside of 
the Association. 

He took very definite exception to the report which seemed to him, from the 
single reading of Mr. Lubin, to imply that the Association might, on its own 
initiative, attempt to investigate any statistical procedure in private industry 
which in the opinion of the committee, or the members representing the Asso¬ 
ciation, or of the Association itself was deemed to be bad statistical practice. 

* See the report of the Committee in this issue of the Journal 




He stated that, “Such police investigations into the work or actions of private 
business cannot be carried out within the American system of free enterprise 
and might react so as to bring serious harm to the Association.” 

Phillip J. Rulon moved that Mr. Lubin’s report be accepted without implica¬ 
tion that its recommendations be carried out. In further discussion of the motion 
Harold Hotelling requested clarification of what ASA action would result in 
“regimentation of private enterprise.” Isador Lubin read the following excerpt 
from the annual membership meeting at Atlantic City in 1947. “The report of 
the Committee on Standards was approved after some discussion and recom¬ 
mendations that the standing committee to be appointed be expanded to in¬ 
clude subcommittees of Association Fellows who would work on the develop¬ 
ment of standards.” Mr. Brown maintained that while he agreed with the 
original vote of 1947, he did not agree with the proposed recommendations. 

Milton Epstein proposed an amendment to the motion which would provide 
that the words “without implication” be stricken from the motion. Theodore H. 
Brown moved that the entire matter be referred to the Council instead. The 
Epstein motion was put to the vote and was carried by a substantial majority. 
Morris M. Copeland moved that the motion be amended to provide that the 
report be referred to the Council without recommendation. This motion was 
carried by a standing vote. 

President Snedecor expressed his great appreciation of the work of the various 
committees carried on during 1948, and expressed his particular gratitude to the 
Committee on Nominations for its excellent work in preparing the slate of officers. 

Gertrude M. Cox, Chairman of the Committee on Elections, read the official 
report of the Committee.* 

Simon Kuznets, newly-elected President of the Association, accepted the chair, 
and as his first official act, gave the floor to George W. Snedecor, retiring presi¬ 
dent, who delivered his presidential address.** 

President Kuznets expressed the gratitude of all members of the Association to 
George Snedecor for his active leadership during 1948. 

Merrill M. Flood announced that the 1949 Annual Meeting would be held in 
New York, December 27-30, 1949. 

Helen M. Walker, reporting for a Resolutions Committee consisting of herself 
and Morris M. Copeland, presented the following four resolutions: 

RESOLVED: That the officers and members of the American Statistical 
Association express to Gale Ober and the members of his committee their sincere 
appreciation for the careful planning of all the arrangements for these meetings. 

RESOLVED: That the officers and members of the ASA express deep apprecia¬ 
tion for the excellent program prepared by members of the Program Committee, 
namely: Joseph Berkson, Ernest Blanche, John Cover, A. Ford Hinrichs, Simon 
Kuznets, Rensis Likert, Abraham Wald, Allen Wallis, Harry Wellman, Aryness 
Joy Wickens, and Merrill M. Flood, Chairman. 

Whereas the Placement Committee of the New York Chapter has, during the 
past year, carried on a remarkably effective service to its members, RESOLVED: 
That the Board and Council give serious thought to ways in which such place¬ 
ment service can be more adequately financed and can be extended on a nation¬ 
wide scale. 


* See Report of the Committee in this issue of the Journal. 

** March, 1949, Journal of the ASA. 






Whereas those members of the ASA who have major interest in the social sci¬ 
ences constitute a large proportion of the present membership of the Associa¬ 
tion, and whereas many of these members have expressed dissatisfaction with 
the benefits received from their membership, RESOLVED: That the Board and 
Council give serious thought to ways and means of providing a balanced program 
of services to all its members. 

These four resolutions were voted on by the membership and approved. 

The following two resolutions were transmitted by Miss Walker without any 
recommendation from her Committee. 

Resolution on Marriage and Divorce Statistics: Whereas, the American 
Statistical Association recognizes the need for more adequate marriage and divorce 
statistics, to provide data needed in statistical research in many areas such as 
demography, health, business, welfare and others; and 

Whereas the development of national vital statistics of marriages and divorces 
depends upon cooperative state-federal relationships along the lines which have 
proven effective for the vital statistics of births and deaths: 

RESOLVED, that the American Statistical Association recognizes the need for 
state centralization of marriage and divorce records and statistics and for their 
integration with other vital records and vital statistics; and urges all states not 
yet operating such an integrated system to initiate such a program at the earliest 
possible time; and 

RESOLVED, that the American Statistical Association calls upon the Federal 
Security Agency, through its National Office of Vital Statistics in the Public 
Health Service, to encourage the development of systems of marriage and di¬ 
vorce records and statistics in every state leading to maximum comparability 
and prompt availability of data; and to undertake the development of detailed 
national marriage and divorce statistics, adequate to meet the pressing needs 
for such vital statistics. 

Resolution on Birth Statistics: Whereas, birth statistics are essential to 
developing local, state, and national estimates of population changes, determining 
trends in family size, and are required for planning and evaluating health, wel¬ 
fare, and related programs, and 

Whereas, the usefulness of birth statistics would be increased by improved 
registration completeness and by a current knowledge of the completeness in 
which births are registered in local areas and by various characteristics of the 
population, and 

Whereas, such a knowledge would also provide the basis for promotion of 
complete registration, 

RESOLVED, that the American Statistical Association strongly urges that a 
uniform, nation-wide test of registration completeness be carried out in 
conjunction with the 1950 decennial census of population in order to provide the 
required measures of registration completeness on local, state, and national levels 
and that every effort be made to obtain more complete registration. 

The membership voted to refer these two resolutions to the Council for con¬ 
sideration. 

Aryness Joy Wickens rose to present the following resolution. 

RESOLVED, that a special Sub-Committee be set up by the Committee on 
Committees to consider and formulate specific standards for the selection of 
Fellows, in carrying out the constitutional provision that fellows shall be 
“statisticians of established reputation”; that the recommendations of this special 
Sub-Committee be forwarded to the Board and Council for its approval; 
whereupon the approved standards shall constitute the criteria for the selection of 
fellows. 

Theodore H. Brown, Robert W. Burgess, and Stuart A. Rice spoke for the 
resolution. W. Edwards Deming took the position that the Committee on 
Fellows should review its own standards. The resolution was put to the vote and 
was passed in the form in which Mrs. Wickens presented it. 

Eugene Pike introduced a resolution which would provide that the Council, 
where it is in the interest of the membership, state those matters of policy before 
it and ask the membership to communicate with the Council concerning it. 
Milton Epstein, in discussing the Pike resolution, asked that one-half day of 
each Annual Meeting be set aside for business meetings so that the membership 
might have an opportunity to prepare resolutions and discuss all matters of 
policy before the Association. The resolution was voted as presented. Helen M. 
Walker reminded the assembled members at this point that annual business 
meetings are not mandatory under the new Constitution and that any such 
provision would have to be made on the initiative of the officers of the Associa¬ 
tion. 

The meeting was adjourned. 

Report of the Board of Directors 

The year 1948 has been one of transition in the affairs of the Association. It has 
been marked by the orderly movement toward the organization structure en¬ 
visaged in the new Constitution, by the reorganization of the national office under 
a new Secretary-Treasurer, and by an unavoidable increase in membership dues. 

The new Constitution was adopted by mail ballot of the membership in Febru¬ 
ary 1948 and became effective on January 1, 1949. Nominations and elections of 
officers for 1949 were in accordance with the new Constitution. For the first 
time in the history of the Association the membership voted by mail ballot for 
national officers. The representatives of the newly formed Districts were also 
elected to the Council. The Council, now the policy forming body of the 
Association, held its first meeting on December 29, 1948. 

The 108th Annual Meeting of the Association was held in Cleveland between 
December 27 and 29. Joint sessions were held with the American Economic 
Association, the American Marketing Association, the Econometric Society, the 
American Farm Economic Association, the Population Association of America, 
the American Public Health Association, the Institute of Mathematical Statis¬ 
tics, the Biometric Society, and the Ohio Section of the American Society for 
Quality Control. Over 600 people attended the sessions. 

It is a noteworthy improvement that the planning of the 1948 Annual Meeting 
program was completed before October. Moreover, the 1949 Annual Meeting 
program has already been outlined, approved by the 1948 Program Committee, 
and referred to the 1949 Program Committee for consideration and further 
planning. 

Definite arrangements have been concluded for a four-day meeting in New 
York City during the Christmas holidays of 1949. The fortunate circumstance 
that both the American Association for the Advancement of Science and the 
Allied Social Science Associations will be meeting in New York at that time 
made it possible for the Association to plan combined sessions that will cover an 
exceptionally broad subject field. Joint sessions have already been scheduled 
with the American Economic Association, American Farm Economic 
Association, American Marketing Association, American Psychological Association, 
American Society for Testing Materials, Biometric Society, Econometric Society, 
Institute of Mathematical Statistics, Population Association of America, and 
the Housing Research Committee of the Social Science Research Council. There 
are tentative plans for other joint sessions with the American Society of Mechan¬ 
ical Engineers, American Astronomical Society, and the Psychometric Society. 

The situation with respect to Association membership is not entirely 
heartening. It will be recalled that the Board of Directors reluctantly recommended a 
substantial increase in membership dues from $5.00 to $8.00 per annum, the 
increase being financially unavoidable in order to meet continually increasing 
costs of printing and other services in an inflationary period. This decision was 
ratified by mail ballot of the membership in February 1948. The withdrawal of a 
portion of the membership in the face of this dues increase, either because of a 
marginal interest in statistics or because their real incomes were adversely af¬ 
fected by the inflation, had, of course, been anticipated and discounted in advance. 
The actual result for 1948 is that the size of the Association has remained about 
constant. About 800 members had dropped out and an about equal number of 
new members have come in. While this withdrawal of 800 members is regrettable, 
it is surely not a recurrent item. During the year, new membership applications 
totaled approximately 725 as against 1,100 in 1947. This slackening of the rate 
of increase can probably be attributed to the deterrent effect of the higher dues 
in this and other Associations, and to the fact that vigorous canvassing in 1946 
and 1947 of the most likely sources of new members has forced recourse to more 
marginal groups from the standpoint of their concern with statistics. 

The main organizational change in the national office has been the appointment 
of a new Secretary-Treasurer after the resignation of Dr. Lester S. Kellogg from 
that position at the 1947 Annual Meeting. Sylvia C. Weyl took over as Acting 
Secretary-Treasurer on an ad-interim basis until May. During the spring of 1948 
an ad hoc nominating committee, under Samuel S. Wilks, energetically searched 
the field for a man combining the many managerial and scientific attainments 
necessary for the executive leadership of the Association. After considering a 
number of candidates Merrill M. Flood was chosen, and his nomination as 
Secretary-Treasurer was unanimously approved by the Board of Directors. In 
addition to his duties as Secretary-Treasurer, Dr. Flood served as Chairman of 
the 1948 Program Committee which organized the 1948 Annual Meeting and 
planned the sessions to be held in 1949. The Council of the Association at its 
December 29 meeting re-elected Dr. Flood for a 3-year term. 

Two new chapters were added to the Association in 1948. A Saint Louis 
Chapter is now functioning regularly, and the old and established Sacramento 
Statistical Society has become a chapter of the Association. 

A temporary Committee on Publications was appointed by President 
Snedecor, with William G. Cochran as Chairman, to outline the policies to be followed 
by the various Association publications, to suggest better means of cooperation 
among them, and to make other such recommendations within this general area 
as it deemed proper. This Committee has now tendered its report to the Board 
of Directors. Under the new Constitution, the standing Committee on Publica¬ 
tions will continue this function. 

The Commission on Statistical Standards and Organization, under the 
chairmanship of Isador Lubin, was organized and held its first meeting in November. 
Three of its members participated in the survey by the Social Science Research 
Council of public opinion polling methods—with special reference to the inaccu¬ 
rate forecasts of the 1948 election returns. Several problems of major importance 
have been placed before the Commission. It will surely be in a position of in¬ 
fluence and develop rapidly during 1949 as a positive force for the advancement of 
statistics. 

With the 1950 Census of the United States already approaching, the Census 
Advisory Committee has been actively engaged in aiding the Director of the 
Census in reviewing and appraising schedules. It has screened many requests 
from the public to the Census Bureau for inclusion of additional questionnaire 
items. This Committee has an added responsibility now that sampling methods 
are under fire and yet of such critical importance in the work of the Bureau of 
the Census. 

Within the field of Association publications, the Biometrics Bulletin has been 
increased to four 64 page issues, and both format and contents have been greatly 
improved. The name of the publication has been changed to Biometrics. The 
American Statistician, launched in the summer of 1947, has been very well re¬ 
ceived by the membership and has already established itself as a valuable period¬ 
ical of the Association. 

The Association cooperated informally in the work of the National Bureau of 
Economic Research under Frederick C. Mills, for the Hoover Commission on the 
Organization of the Executive Branch of the Government. At the request of 
Dr. Mills, President Snedecor arranged for an independent study of the statistical 
activities of the Federal Government and preparation of a report for the member 
group. John W. Tukey canvassed a representative group of members of the 
Association, many of whom are familiar in detail with the statistical practices 
and problems of the Federal Government, and obtained their evaluation of the 
caliber of federal statistical work and their recommendations on its consolidation, 
extension and improvement. His excellent report, which summarizes the judg¬ 
ments obtained was forwarded to Dr. Mills. 

The International Statistical Institute advised the ASA of the adoption of its 
new statutes, which provide for the affiliation of national and international 
statistical societies. The ISI has been reorganized to work as an active professional 
organization dedicated to the furtherance of statistical science throughout the 
world. In view of the similarity of purpose of the Institute and the Association, 
Stuart A. Rice, president of the ISI, expressed the hope that the Association 
would request affiliation. The proposal is being considered by the Council. 

The Association has an increasing interest in the applications of statistics to 
management problems; many of its members are active in such fields as business 
economics, marketing, and quality control. The National Management Council, a 
federation of national organizations concerned with management problems, has 
invited the Association to become a member. Among its present members are the 
Society for the Advancement of Management and the American Society of 
Mechanical Engineers. The invitation will be acted upon by the Council. 

The Eastern North American Region of the Biometric Society requested 
affiliation with the ASA during the past year. Decision on this application will be 
made by the Council. The Biometric Society, an international organization of 
biometrics workers, decided in 1948 to designate Biometrics as its official organ. 
Block subscriptions to Biometrics were made available to members of the soci¬ 
ety at a reduced rate as a means of assistance for the new Society. 

Plans for several new activities of the Association have been partially 
developed during the year. Among these are establishment of a speaker’s bureau to 
serve the chapters, formation of a research group to study the program and 
operations of the Association, and initiation of regional District meetings. Some 
or all of these new projects will be activated during 1949. These new projects are 
an integral part of your Board’s program during 1949. 

Regrettably, the need for economy during 1949 made it necessary to postpone 
indefinitely the project for launching the periodical “Statistical Reviews.” 

The past year has been one of unavoidable financial stress and structural 
reorganization. The Board considers that the major problems involved in this 
postwar reorganization have been faced and are now on the road to solution. The task 
ahead of the Association is to consolidate organizationally and financially and at 
the same time to continue to expand its membership and carefully selected activ¬ 
ities, in the face of an inflationary trend that is prejudicial both to professional 
societies and to the memberships they serve. 

George W. Snedecor, President 
Isador Lubin 
Lowell J. Reed 
Walter A. Shewhart 
Frederick F. Stephan 
Samuel Stouffer 
Willard L. Thorp 
Merrill M. Flood, Secretary-Treasurer 

The Secretary's Report on Membership 

There were 4,720 members remaining at the beginning of 1948 after 189 were 
dropped for non-payment of dues. Although 725 new members joined during 
1948, and 27 were reinstated, there was a net decrease of 70 bringing the total 
on December 31, 1948 to 4,650. 

The 1948 membership is composed of the following groups: 

Honorary members               13 
Fellows                       138 
Student members               290 
Regular members             4,209 
Total Membership            4,650 

Corporate Members               6 

Over 200 members resigned during the year and some 600 were dropped for 
non-payment of dues. The primary cause for this heavy loss in membership is 
undoubtedly the increase in dues from $5.00 to $8.00. The unfortunate delay in 
settling the matter of dues increase, and the consequent lateness in getting dues 
notices to the members, also had an adverse effect on membership. 
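
The membership figures reported above can be cross-checked with simple arithmetic. The following sketch is not part of the Secretary's report; it merely recomputes the net change, the number of members implied to have left during 1948, and the total of the four membership groups. The variable names are illustrative assumptions.

```python
# Figures as stated in the Secretary's report; variable names are illustrative.
start_1948 = 4720      # members at the beginning of 1948
new_members = 725      # joined during 1948
reinstated = 27        # reinstated during 1948
end_1948 = 4650        # members on December 31, 1948

# Net change over the year and the departures implied by the stated totals.
net_change = end_1948 - start_1948
implied_losses = start_1948 + new_members + reinstated - end_1948

# The four groups listed in the report should add up to the year-end total.
groups = {
    "Honorary members": 13,
    "Fellows": 138,
    "Student members": 290,
    "Regular members": 4209,
}
group_total = sum(groups.values())

print(net_change, implied_losses, group_total)
```

The implied loss is broadly in line with the report's "over 200" resignations and "some 600" members dropped, together with the deaths recorded during the year.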

The members of the Biometrics Section, at the end of December 1948, number 
960, of whom 170 are associate members and 790 are also regular members of the 
Association. 

The deaths of the following members were recorded during the year: Wesley C. 
Mitchell, Fellow; Edward G. Benson, Bertram Butler, Ruth Dawson, Valentino 
Dore, T. Bertrand Graham, Edward W. Higgins, Arthur Hurd, Walter E. Magney, 
Douglas W. Oberdorfer, Hallie K. Price, O. A. Pope, H. M. Tompkins, John 
H. Watkins, Regular Members. 


Merrill M. Flood, Secretary 











AMERICAN STATISTICAL ASSOCIATION JOURNAL, JUNE 1949 


Report of the Nominating Committee 

The Nominating Committee of the American Statistical Association announces 
the election of the following officers for the year 1949. 


President                    Simon Kuznets 
President Elect              S. S. Wilks 
Vice President 
    3 years                  Dorothy S. Brady 
    2 years                  H. A. Freeman 
    1 year                   Lester S. Kellogg 
Directors (3 year term)      C. H. Goulden 
                             L. L. Thurstone 

District Representatives 
    Western District 
    North Central 
    Southeastern 

    Maurice I. Gershenson 
    Henry B. Moore 
    Howard L. Jones 
    Samuel Weiss 
    Morris H. Hansen 


They were selected as persons 

(1) who are distinguished in their contributions in applying methods to 
problems in various fields 

(2) who are acquainted with and contributing to statistical theory 

(3) who represent the various fields of membership interests and the different 
geographical regions of the country. 

Respectfully submitted, 

Nominating Committee: 

Milton Friedman 
Charles F. Sarle 
Mortimer Spiegelman 
Holbrook Working 
Gertrude M. Cox, Chairman 


Report of the Committee on Fellows 

The American Statistical Association has announced the election of four of 
its members as Fellows: three American statisticians and one Canadian 
statistician. (There are now 145 Fellows of the American Statistical Association from a 
total membership of about 4,700.) 

The newly elected Fellows are: 

Eugene L. Grant, Professor of Civil Engineering at Stanford University, who has 
developed special courses for teaching statistics for engineers, and who has 
been a leader in the practical application of statistics to engineering. 

Tjalling C. Koopmans, of the Cowles Commission for Research in Economics at 
the University of Chicago, who has done outstanding work in the application 
of statistics to economics. 

W. Allen Wallis, Professor of Statistics and Business Economics at the School of 
Business at the University of Chicago, who was Administrative Director of 
the Statistical Research Group of the Office of Scientific Research and 
Development which developed and supplied answers to many statistical problems of 
military importance submitted by the armed services. 





J. W. Hopkins, Canadian biometrician, of the Division of Biology and 
Agriculture of the National Research Laboratory of Canada and Coordinator of the 
Special Committee on Applied Mathematical Statistics, of the National 
Research Council of Canada. His principal work has been done in applying 
statistical methods to experiments with grains and livestock. During the war he 
served as scientific advisor to the Air Vice Marshal of Canadian 
anti-submarine warfare. 

MINUTES OF THE MEETING OF THE COMMISSION ON 
STATISTICAL STANDARDS AND ORGANIZATION 

Present: S. Wilks 
         W. Shewhart 
         F. Croxton 
         I. Lubin 

Absent:  L. Reed 

CHARTER RECOMMENDATIONS 

The Commission recommends that the Committee on Committees, in preparing 
the charter for the Commission on Standards, should define its functions and 
competence in the terms stated in the report approved by the members of the 
American Statistical Association at its annual business meeting on January 25, 
1947, as follows: 

The Committee should have rotating membership based on a three year 
term; one-third of the initial membership to be appointed for one year, 
one-third for two years, and one-third for three years. Members should be 
eligible to reappointment. Election should be made by the Board of 
Directors after consultation with the Committee. 

The functions of the Commission should be: 

A. to provide a tribunal to render opinions and recommendations on 
controversial issues relating to statistical procedure and presentation of 
statistical material. 

B. develop a list of minimum standards for published statistical materials 

C. upon request from governmental bodies, review actual or proposed 
undertakings and make recommendations relative to standards. 

The Committee might eventually develop a code of ethical practices in 
statistical work. 

It is recommended that these terms be amplified to the extent of adding at the 
end of the last sentence above the words “with a view to enhancing the status and 
acceptance of statisticians by the general public.” 

Rules of Procedure. The Commission decided to make no recommendations 
relative to rules of procedure. It was the consensus that these rules should be 
developed as experience dictates. 

Membership. The Commission recommends that Joseph Davis of Stanford 
University and Samuel Stouffer of Harvard University be added to its 
membership. 

Budget and Staff. It is impossible at this time to make any definite estimate as 
to the budgetary requirements of the Commission. Its financial requirements 
will depend upon the amount of work that will be undertaken during the coming 
year, the nature of the projects it undertakes and the extent to which 
Governmental agencies will finance studies and surveys made upon Governmental 
request. 





The Commission, however, feels that it is essential that there be a part-time 
secretary available to it. Eventually it will be necessary to have a full-time 
secretary. In order to finance a part-time secretary and the expenses of 
Commission and Sub-Commission members, it is estimated that for the year 1949 
approximately $15,000 will be required if the Committee is to undertake projects 
that are listed below. 

Projects 

The Commission gave consideration to possible projects that it might undertake 
in the immediate future. 

1. Joint Committee on the Economic Report. The Commission had before it a 
request submitted by the Staff Director of the Joint Committee on the Economic 
Report of the House and the Senate relative to its reports on “Statistical Gaps” 
and “Economic Indicators.” The Commission felt that this request should be 
complied with and that it should undertake to study and assess both of these 
Joint Committee reports. It is proposed to set up a Sub-Commission at an early 
date to undertake this project. 

2. Consumer Price Index. The Commission feels that a request from the 
Commissioner of Labor Statistics to advise the Bureau of Labor Statistics concerning 
technical questions involved in constructing the Consumer’s Price Index should be 
complied with. It was the feeling of the Commission, however, that the success of 
projects of this sort will be greatly enhanced if its services are requested before 
any definite action has been taken by the agency involved. In other words, the 
Commission would prefer that it be called in at the time that technical problems 
are in process of being studied rather than to pass judgment at a later date upon 
already determined procedures. 

3. Standards for Publication of Federal Statistical Data. The Commission agreed 
that it should comply with the request of the Bureau of the Budget for a review 
of the Bureau’s report on “Standards for the Publication of Statistical Data.” It 
felt, however, that it was not within its competence to comply with the Bureau’s 
request that it suggest principles that should guide the Federal Government in 
its release or suppression of statistical information during an emergency. 

4. The Adequacy of Present Types of Federal Statistics. At the April 1948 
meeting of the Board of Directors a resolution by Mr. Solomon Barkin to the effect 
that the American Statistical Association should investigate its responsibilities in 
the field of industrial statistics was referred to this Commission. At a later date 
the Labor Advisory Committee of the Bureau of the Budget adopted a resolution 
submitted by Mr. Barkin requesting that the ASA be called upon to establish 
a committee to appraise the adequacy of present types of statistical data and the 
direction in which statistical information should be developed. 

The Commission considered these resolutions and is of the opinion that no 
action should be taken upon them until the report of the Hoover Commission is 
made public. Considerable time and energy have been devoted by the Staff of the 
Hoover Commission to the question of the collection and development of federal 
statistics. Indeed, a paper on the work of the Hoover Commission in the field of 
statistics will be delivered at the forthcoming annual meeting of the American 
Statistical Association by Dr. Mills. The Commission recommends that any 
possible action be deferred until the Hoover Report can be studied and analyzed. 

5. The Kinsey Report. It has been suggested that the Commission review the 
Kinsey report with a view to passing judgment on the methodological statistical 
techniques used by the author. It was the unanimous opinion of the Commission 
that we do nothing in this field unless specifically requested by a representative 
body of American citizens. 

6. Polling Techniques. The widespread criticism of the election polling results 
has resulted in requests that the Commission initiate studies of the techniques 
used by the various polling organizations. In view of the fact that the Social 
Science Research Council has appointed a special committee to study the 
election polls and in view of the further fact that two members of this Commission 
are on the SSRC Committee, it was felt that no action should be taken in this 
field at this time. The Commission was aware of the fact that further work in 
the field of polling would have to be done after the SSRC Committee completes 
its report. However, provision has already been made for such work by the 
National Research Council and the Social Science Research Council. 
Arrangements have been made for studying special problems such as sampling, 
interviewing, and panels. It may be advisable after these studies have been completed 
for the Commission to look further into the question of polling techniques. At the 
moment, however, it feels that anything it might undertake would be more or 
less in the nature of duplicating work already under way. 

Independent Investigations. The Commission feels that in those instances 
where it is not specifically requested by a Government agency, by a foundation 
or similar body which financed or sponsored a study, or by a representative body 
of citizens to undertake investigations of projects or of statistical standards it 
should act on its own initiative only in those instances where the public interest, 
domestic or international, is concerned. As problems arise that are affected with 
the public interest, the Commission will take under consideration the advisability 
of investigating them with a view to pointing out to the public such weaknesses 
and inadequacies of statistical techniques and presentation as it may find. Its 
primary purpose shall be to further higher standards in statistical fields and 
thereby enhance the status of statisticians in the mind of the public. 


Report of the Treasurer 

The 1948 budget as originally approved by the Board of Directors planned for 
an income of $61,900 and expenses of $58,925. Actual income was $51,320 and 
expenses were $64,605. In September, the Board voted a revised budget providing 
for the increase in expenses and decrease in income. 
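
The gap between the original budget and the actual results can be made explicit with a short calculation. The sketch below is an illustration only, not part of the Treasurer's report; it uses the rounded dollar figures quoted in the paragraph above, and the dictionary keys are assumed names.

```python
# Rounded dollar figures quoted in the Treasurer's report; key names are illustrative.
budget = {"income": 61_900, "expenses": 58_925}
actual = {"income": 51_320, "expenses": 64_605}

# Variance = actual minus budget (negative income variance means a shortfall).
variance = {item: actual[item] - budget[item] for item in budget}

# Planned and realized balance of income over expenses.
planned_balance = budget["income"] - budget["expenses"]
actual_balance = actual["income"] - actual["expenses"]

print(variance, planned_balance, actual_balance)
```

On these figures the income shortfall is roughly twice the expense overrun, which matches the report's emphasis on lost membership as the main cause of the deficit.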

Income was less than expected primarily because some 600 more members than 
had been estimated left the Association during the year. Also there were some 
275 fewer new members than estimated, and receipts from subscriptions, sales, 
and advertising were some $4,000 less than estimated. Income for 1948 exceeded 
that for 1947 by $14,315, nevertheless, because of the additional revenue derived 
from higher dues and subscription rates. 

Expenses were somewhat higher than anticipated because of increases in 
printing costs. Printing costs will be still higher in 1949. 

The national office plans to continue its drives for new members and to make 
a new drive for additional advertising and subscriptions. Through this expansion 
the Association will widen its sphere of influence and increase its services, and 
at the same time improve its financial status. 


Merrill M. Flood, Treasurer 





Report of the Auditors 

To the Board of Directors of 

American Statistical Association 

We have examined the attached financial statements of American Statistical 
Association relating to the year ended December 31, 1948. Our examination was 
made in accordance with generally accepted auditing standards, and accordingly 
included such tests of the accounting records and such other auditing procedures 
as we considered necessary in the circumstances. 

The recorded cash receipts for the year were traced to the deposits shown on 
the bank statements and the amounts for dues and subscriptions were tested 
with the membership and subscription records. The paid checks were inspected 
and related vouchers tested in support of cash disbursements for the year. The 
bank balances were reconciled with amounts reported direct to us by the 
depositaries and the cash on hand and the securities owned at December 31, 1948 
were verified by inspection. We did not check the membership and subscription 
records in detail or make any independent verification of the inventory of old 
Journals, the office records of which are based, in part, on data assembled in prior 
years, no recent physical inventory having been taken. 

The life membership reserve at December 31, 1948 reflects the amount needed 
to support a life annuity for each life member in the same annual amount as that 
which could have been purchased by the original lump sum payment, based 
on a 2½% interest rate, the 1937 Standard Annuity Table and the age of the 
life member when the lump sum payment was made, in accordance with a 
resolution of the Board of Directors adopted pursuant to a mail ballot in January 1949. 
Previously the reserve had been calculated on the basis of the combined annuity 
table of mortality with assumed interest at 4% per annum and an assumed 
annuity of $5 per member. The amount treated as income from life memberships 
in 1948 represents the excess of the reserve at the beginning of the year over the 
required reserve, on the new basis, at the end of the year. 
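
The reserve rule described above can be illustrated with a minimal sketch. It is not the auditors' calculation: the mortality function `toy_q` is a hypothetical stand-in for the 1937 Standard Annuity Table (which is not reproduced here), payments are assumed to fall at the start of each year, and all function names are invented for illustration. The idea is to convert each lump sum into the level annual amount it would have bought at the member's age when paid, then value that same annual amount at the member's present age.

```python
def annuity_due(age, interest, q, max_age=110):
    """Present value of 1 per year, paid at the start of each year while alive."""
    v = 1.0 / (1.0 + interest)
    value, survival, discount = 0.0, 1.0, 1.0
    for x in range(age, max_age):
        value += survival * discount   # payment made if alive at age x
        survival *= 1.0 - q(x)         # probability of surviving to age x + 1
        discount *= v                  # one more year of discounting
    return value


def toy_q(x):
    # HYPOTHETICAL one-year mortality rate, NOT the 1937 Standard Annuity Table.
    return min(1.0, 0.0005 * 1.09 ** (x - 20))


def reserve(lump_sum, age_at_payment, age_now, interest=0.025, q=toy_q):
    """Reserve for one life member under the rule described in the report."""
    annual_amount = lump_sum / annuity_due(age_at_payment, interest, q)
    return annual_amount * annuity_due(age_now, interest, q)


# Illustrative member: paid 100 at age 40 and is now 60.
print(round(reserve(100.0, 40, 60), 2))
```

Under any mortality table in which death rates rise with age, the required reserve for a surviving member falls as the member grows older, which is why part of the reserve is released as income each year.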

The accounts for the year 1948 include for the first time a provision for 
employees' accrued annual leave. The reserve of $1,209.54 provided at December 31, 
1948 includes $506.50 applicable to the prior year. We understand that this 
provision was made at the direction of the Secretary. 

In our opinion, the accompanying statements present fairly the position of 
American Statistical Association at December 31, 1948 and the results of its 
operations for the year, in conformity with generally accepted accounting 
principles applied on a basis consistent, except as mentioned in the two preceding 
paragraphs, with that of the preceding year. 

Price, Waterhouse & Co. 

Washington, D. C. 

April 27, 1949 



PROCEEDINGS OF 108TH ANNUAL MEETING 

American Statistical Association Balance Sheet 

                                                             December 31, 
                                                             1948          1947 

Cash in bank and on hand                               $ 7,513.00    $ 1,853.91 
Accounts receivable                                      1,204.97      1,356.22 
Investments: 
  United States Savings Bonds, Series D, at 
    redemption value                                     6,406.00      6,138.00 
  Stocks, at cost (at market quotations $8,219 
    and $6,150, respectively)                            5,793.50      5,793.50 
Inventory of old Journals, at approximate cost           1,907.51      2,024.63 
Furniture and equipment, at cost less depreciation       2,892.19      2,482.76 
Deferred charges                                           277.80 
                                                       $25,994.97    $19,649.02 

Liabilities 

Accounts payable                                       $ 4,631.43    $ 6,515.76 
Note payable                                             5,000.00 
Accrued interest                                            37.50 
Accrued annual leave                                     1,209.54 
Deferred income (collections applicable to 
  subsequent year): 
  Dues                                                  15,964.53      1,054.50 
  Subscriptions                                          3,797.59      2,704.76 
  Other                                                                  115.35 
                                                       $30,640.59    $10,390.37 

Net worth: 
  Life membership reserve                              $ 3,131.40    $ 3,750.20 
  Surplus, per statement                                (7,777.02)     5,508.45 
                                                       $(4,645.62)   $ 9,258.65 

                                                       $25,994.97    $19,649.02 

Surplus Statement 

                                                       Year ending December 31, 
                                                             1948          1947 

Balance at beginning of year                           $ 5,508.45    $ 4,248.79 
Add: Transfer of balance in Centenary Sustaining 
  Fund                                                                 6,379.51 
                                                       $ 5,508.45    $10,628.30 
Deduct: Excess of expenses over income for the year, 
  per income statement                                  13,285.47      5,119.85 
Balance at end of year                                 $(7,777.02)   $ 5,508.45 
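
The balance sheet and surplus statement can be checked against each other. The sketch below is an illustrative cross-check, not part of the auditors' report; it assumes that the entries printed with a single amount (deferred charges, note payable, accrued interest, accrued annual leave) belong to the 1948 column, an assignment that the printed totals bear out. Variable names are illustrative.

```python
from decimal import Decimal as D

# 1948 column of the balance sheet (single-amount entries assumed to be 1948).
assets_1948 = [D("7513.00"), D("1204.97"), D("6406.00"), D("5793.50"),
               D("1907.51"), D("2892.19"), D("277.80")]
liabilities_1948 = [D("4631.43"), D("5000.00"), D("37.50"), D("1209.54"),
                    D("15964.53"), D("3797.59")]
life_membership_reserve = D("3131.40")

# Surplus statement: opening balance less the loss for the year.
surplus_start = D("5508.45")
loss_for_year = D("13285.47")
surplus_end = surplus_start - loss_for_year

total_assets = sum(assets_1948)
total_liabilities = sum(liabilities_1948)
net_worth = life_membership_reserve + surplus_end

# Liabilities plus net worth should equal total assets.
print(total_assets, total_liabilities, net_worth)
```

The negative net worth arises because dues collected in advance for 1949 are carried as a liability while the year's loss has exhausted the surplus.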
























American Statistical Association 
Income Statement 

                                                       Year ending December 31, 
                                                             1948          1947 

Income: 
  Dues, current year                                   $32,495.90    $21,298.58 
  Dues, prior years                                         54.65        215.00 
  Life membership income                                   418.80        397.33 
  Subscriptions                                          7,386.28      6,206.95 
  Advertising                                            1,728.77      2,010.08 
  Reprints                                                 892.09        509.00 
  Journal sales                                          2,702.19      2,827.80 
  Biometrics Section income                              4,212.70      2,466.86 
  Miscellaneous                                            724.15        684.70 
  Dividends and interest                                   704.50        398.57 
                                                       $51,320.03    $37,014.87 

Expenses: 
  Journal: printing, mailing and reprints              $12,160.70    $10,105.81 
  Salaries and wages, including in 1948 $506.50 
    accrued annual leave expense applicable to 
    prior year                                          30,582.81     17,037.26 
  American Statistician Bulletin                         5,875.55      4,259.88 
  Biometrics Section expenses                            4,316.29      2,366.46 
  Rent                                                   2,880.00      1,365.00 
  Office supplies, printing and mimeographing            1,599.27      1,764.42 
  Postage                                                1,748.74      1,511.11 
  Telephone and telegraph                                  754.81        512.03 
  Travel expense, officers                                 649.20        603.91 
  Annual meeting expense                                   613.31        647.89 
  Depreciation of furniture and equipment                  442.86        324.99 
  Promotion expense                                        743.28 
  Loss on disposal of equipment                                          151.09 
  Storage of old Journals                                   71.95         72.00 
  Cost of old Journals sold                                573.10        382.27 
  Miscellaneous                                          1,593.63      1,030.60 
                                                       $64,605.50    $42,134.72 

Balance, loss, carried to surplus                     $(13,285.47)   $(5,119.85) 
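
The 1948 column of the income statement can be re-added as a check on the printed totals. The sketch below is illustrative and not part of the auditors' report; it assumes that promotion expense, printed with a single amount, belongs to 1948 (the printed expense total agrees only on that assumption). Variable names are illustrative.

```python
from decimal import Decimal as D

# 1948 income items, in the order printed.
income_1948 = [D("32495.90"), D("54.65"), D("418.80"), D("7386.28"),
               D("1728.77"), D("892.09"), D("2702.19"), D("4212.70"),
               D("724.15"), D("704.50")]

# 1948 expense items, in the order printed (promotion expense assumed to be 1948).
expenses_1948 = [D("12160.70"), D("30582.81"), D("5875.55"), D("4316.29"),
                 D("2880.00"), D("1599.27"), D("1748.74"), D("754.81"),
                 D("649.20"), D("613.31"), D("442.86"), D("743.28"),
                 D("71.95"), D("573.10"), D("1593.63")]

total_income = sum(income_1948)
total_expenses = sum(expenses_1948)
loss_for_year = total_expenses - total_income

print(total_income, total_expenses, loss_for_year)
```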





























BOOK REVIEWS 

Edited by 

Oscar Krisen Buros 
Rutgers University 

Actuarial Statistics: Vol. II, Construction of Mortality and Other Tables. J. L. 
Anderson (Scottish Widows' Fund and Life Assurance Society, 9 St. Andrew 
Square, Edinburgh, Scotland) and J. B. Dow (Standard Life Assurance Co., 3 
George St., Edinburgh, Scotland). Published for the Institute of Actuaries and 
the Faculty of Actuaries. London N.W. 1: Cambridge University Press (Bentley 
House, 200 Euston Road), 1948. Pp. xvi, 281. 21s. 

Review by T. N. E. Greville 
Chief, Actuarial Analysis Branch, Public Health Service 
Federal Security Agency, Washington 25, D. C. 

This book has been written primarily to assist British actuarial students 
in their preparation for the examinations of the Institute of Actuaries 
and the Faculty of Actuaries. As might be expected, the main emphasis is 
on the construction of mortality tables from the records of insured lives and 
life annuitants. However, the two chapters on the construction of national 
life tables at least acquaint the reader with the more important problems 
that usually arise in that connection; and the final chapter on sickness rates 
is also very informative to the uninitiated. 

The principal topic, the construction of mortality tables from the records 
of insured lives, is rather fully discussed, and there is no doubt that this is 
the most complete treatment of the subject thus far published. The very 
detailed explanation of exposed-to-risk formulas will be especially helpful. 
Throughout the book the exposition is lucid and accurate. Clarity is assisted 
by numerous illustrative examples, which add greatly to the usefulness of 
this text. The absence of problems to be worked out by the student is a 
shortcoming which it has in common with other British actuarial textbooks. 

It is natural that the principal emphasis should be placed on developments 
in Great Britain. However, after making all due allowance, it is still 
the impression of the reviewer that the book suffers somewhat from a certain 
insularity of approach. Certain useful contributions have been made by 
actuaries in the United States and Canada which, it is suggested, could most 
appropriately have been mentioned in a book purporting to deal comprehensively 
with a branch of scientific methodology. One is surprised, for 
example, to find no mention of abridged processes of life table construction, 
a subject in which there have been important developments on this side of 
the Atlantic, but in which British demographers have expressed considerable 
interest. Again, on page 255, where it is pointed out that occupational 
mortality statistics based on national population data “are not reliable guides 
for a life office to use for assessing extra premiums for occupation,” some 
mention might have been made of the extensive studies of occupational 
mortality among assured lives in the United States and Canada made by the 
Actuarial Society of America and the Association of Life Insurance Medical 
Directors. 

This is, of course, a minor criticism; and it is the reviewer’s opinion that 
the authors have, on the whole, handled their material in expert fashion and 
have produced a text which will be most useful not only to the students for 
whom it is intended but also to any others having an interest in the subject 
matter. 


Experimental Designs in Sociological Research. F. Stuart Chapin (Professor of 
Sociology, Chairman of the Department, and Director of the School of Social 
Work, University of Minnesota, Minneapolis, Minn.). New York 16: Harper & 
Brothers (49 East 33rd St.), 1947. Pp. xi, 206. $3.00. 

Review by Margaret Jarman Hagood 
Statistician, Division of Farm Population and Rural Life 
Bureau of Agricultural Economics, Washington 25, D. C. 

In this brief volume, Dr. Chapin has summarized the designs and findings 
of nine sociological “experiments” and, in addition, has presented a 
considerable amount of general exposition of experimentation in sociology and 
of sociometric scales. As Chapin points out, the work is complementary to, 
rather than in competition with, Ernest Greenwood's Experimental Sociology, 
which appeared two years earlier. Greenwood treated the subject with 
primary attention to the conceptual formulation and logical principles 
involved, with a critical examination of the literature. Chapin, on the other 
hand, had as his purpose to “illustrate the method of experimental design 
by reproducing concrete studies” and to provide “a source book of examples 
of specific application analyzed in some detail” (p. ix). 

Dr. Chapin has fulfilled this purpose excellently; in fact, no other person 
in the United States is in position to fulfill it equally well. To Chapin goes 
the credit for leadership among sociologists for trying to adapt the 
experimental approach to sociological research problems and for decades of work 
and encouragement of work of others in this field. He and his students have 
made outstanding advances in quantification of hitherto unmeasured social 
phenomena and in continuously carrying out pioneering work to establish 
sociology as a science. 

Without reflection on the author, one can differ with him as to when the 
term “experiment” or “experimental” is to be validly used. I am inclined to 
be much more restricted than is Chapin when it comes to classifying a social 
research project as an “experiment.” The social “experiments” reported in 
his book do not measure up to the criteria that I hold for “experiments.” 
But I do agree with Chapin that in our field, the research worker is 
unrealistic who writes as though one could control the “treatments,” and provide 
for random selection of the units which are to receive “treatments,” as in 
the case of biology and in some areas of psychology. 

R. A. Fisher's methods of analysis of variance and covariance have not 
yet been fully explored in comparison with Chapin's matching technique 
which sacrifices much data. Dr. Chapin is aware of this and it is hoped that 
he will focus research on the problem in the future. 


Elements of Mathematical Statistics. C. V. L. Charlier. Including Table of 
Poisson’s Function by L. v. Bortkiewicz. Edited and translated by A. Greenwood. 
Brooklyn 25, N. Y.: J. A. Greenwood (25 Winthrop St.), 1947. Two reviews 
follow: 

Review by Burton H. Camp 
Emeritus Professor of Mathematics, Wesleyan University 
Middletown, Connecticut 

This well-known book, written in German and published in Hamburg in 
1920, is now available in an English translation, and it is this translation 
which is the subject of this review. The translator has also made occasional 
emendations of the text and an improvement in the tables. He himself 
has computed new tables for φ₂, φ₃ and φ₅, these functions being derivatives 
of the function defining the normal curve; he has made some corrections in 
the original tables; and has added a new table, No. 44. The meanings of 
Tables 43 and 44 are not indicated in the appendix where the tables occur. It 
can be learned from the preface that Table 43 is the table of Bortkiewicz, 
that is, a table of the Poisson function, the reader being left to infer the 
meanings that must be attached to the numbers at the top and side of the 
table. There is no easy way of finding out what Table 44 does mean, but 
the thorough reader will find it explained in a footnote on page 60. 

So far as the reviewer can judge the translation is sufficiently accurate. 
The English is often awkward but not without a quaint charm so that perhaps 
by its very defects the phraseology makes one the more aware of the 
originality of the distinguished author who is thus being presented to us. 
In fact one would read Charlier's book now more to become acquainted 
with Charlier's way of looking at things rather than to study the material 
which the book contains. This material has been worked over many times 
since 1920 and has been presented in various forms in several books; but 
there is much value nevertheless in going to the original source, and this 
translation makes it possible for the English reader to do this. The general 
scope of Charlier's book may be indicated as follows. First of all there is a 
discussion of homograde or alternative statistics. These are statistics showing 
the number of times an event occurs, in a given number of trials, when 
it may either occur or not occur. This sort of frequency distribution may be 
approximated by successive terms of the point binomial, in certain cases, 
and in other cases by the modifications of the point binomial introduced by 
Poisson and by Lexis. Heterograde statistics comprise frequency distributions 
that are now sometimes called continuous. These may be approximated 
sometimes by the normal probability curve, sometimes by the more 
general A-Type or B-Type curve. The subject of simple correlation is 
considered at some length, and the discussion includes what Pearson calls 
tetrachoric correlation. 

Review by Alexander M. Mood 
Statistician, Project RAND, Douglas Aircraft Co., Inc. 
1500 Fourth Street, Santa Monica, California 

It is always interesting to read an older book and compare the state of 
statistics then with now. The present little volume was written in 1910 
(the translation was made from a 1920 German translation) just two years 
after Student's famous paper was written and before the full import of that 
paper was recognized. Fisher’s paper on the foundations of statistics was 
not to appear for ten more years. 

The book is divided into two parts of about fifty pages each; the first part 
discusses what would now be called discrete distributions, and the second, 
continuous distributions. The discussion is largely in terms of the first 
four moments of the distributions and of course in the second part there 
is considerable attention to the Gram-Charlier system of frequency functions. 
Actually there is much in the book that can be found in the best sellers 
in statistics today, but unfortunately that fact cannot be interpreted as a 
compliment to Charlier’s sagacity; it is merely an indictment of our best 
sellers. I refer to such things as the detailed computational instructions in 
terms of class frequencies all outmoded by modern computing machines, the 
hoary errors about skewness and kurtosis still fed our students today, the 
confusion between population parameters and their estimates, estimation 
by the method of moments only, the great faith in the standard error whether 
the distribution is normal or not. But Charlier, at least, cannot be much 
blamed for these shortcomings because they were either not relevant or had 
not been pointed out at the time. 

The real trouble with the book is simply that it was written before the 
Fisherian revolution in statistics. It was written in that period when 
statisticians were trying to develop and formalize statistical inference in terms 
of relatively simple classes of frequency functions: the Pearson curves and 
the Gram-Charlier series. Looking back, it is easy enough to see that such a 
course was actually a blind alley, that a fitted curve contains no more 
information than the data to which it is fitted. But at the time that approach 
must have looked promising and Charlier did significant work in the field. 

Today the book is of little interest except possibly to students of the 
history of statistics, and the reviewer is at a loss to rationalize the 
presentation of this translation. 





Elements of Nomography. Raymond D. Douglass (Professor of Mathematics) 
and Douglas P. Adams (Assistant Professor of Graphics). (Massachusetts 
Institute of Technology, Cambridge, Mass.) New York 18: McGraw-Hill Book Co., 
Inc. (330 West 42nd St.), 1947. Pp. ix, 209. $3.50. [London W.C. 2: 
McGraw-Hill Publishing Co., Ltd. (Aldwych House, Aldwych), 1948. 21s.] 

Review by Joseph Zubin 
Associate Research Psychologist, New York State Psychiatric Institute 
722 West 168th St., New York, N. Y. 

This is an elementary text of nomography for students in engineering 
which may serve very well as an introductory text for any student of 
statistics who wishes to know how nomographs may be constructed. From 
the point of view of the statistician it is unfortunate that no direct 
applications of nomography to statistical problems are made. Perhaps that is a 
task for a future nomographer who has statistics well in hand. This simple 
elementary text ought to prove useful for those students who wish to 
become acquainted with the use of nomographs in their field. 

The more routine tasks of computation such as comparison of percentages
and comparison of correlation coefficients with their standard errors are
more readily accomplished by means of nomographs than by calculating
machines and tables. There are several source books in which such
nomographs are provided but the manner of their construction and the limitations on
their use are not always understood by the average statistician. Furthermore,
no new statistical aids of this type can be invented unless statisticians
become aware of the problems involved in nomograph construction and of the
areas in which they may prove useful. The simplicity of the present text
and its empirical approach ought to win adherents for the use of
nomographs on a wider scale.

A Guide to Public Opinion Polls, Second Edition. George Gallup (Director,
American Institute of Public Opinion, Princeton, N. J.). Princeton, N. J.: Princeton
University Press, 1948. Pp. xxiv, 117. $2.50. [London E.C. 4: Oxford University
Press (Amen House, Warwick Square). 14s.]

Review by Robert Cobb Myers
Educational Testing Service, Princeton, N. J. 

This is a revised edition of the question-and-answer handbook which
was first prepared in 1944. The avowed purpose of the book is “to
answer, in non-technical language, the questions that people most frequently
ask about public opinion polling.” If non-technical can be considered as
synonymous with naive, unsophisticated, incomplete, repetitious, and
contradictory, we can then concede that the author’s purpose in this respect has
been fulfilled.

Many popular misconceptions to the contrary, neither the American
Institute of Public Opinion nor the publisher, the Princeton University Press,
is a part of Princeton University. Each, however, is bound to the university
by many ties. Although the AIPO is a privately owned commercial
organization, its director is listed in the Princeton University catalogue as a
member of the Advisory Councils of the Psychology Department and of the
School of Public and International Affairs. The University’s Office of Public
Opinion Research serves as an official depository and reference guide for the
end products of all AIPO, or Gallup, surveys, these archiving activities
being under the direction of Princeton’s Professor Hadley Cantril, and
financed by the Rockefeller Foundation. In addition, the book Gauging
Public Opinion, by Cantril and various of his graduate students, which was
first published in 1944 by the Princeton University Press, carries a formal
dedication to George Gallup. The situation is, therefore, understandably
confused; and one is often at a loss to know just where Gallup and his AIPO
leave off and where Princeton University begins. However, despite this
confusion, if A Guide to Public Opinion Polls is properly regarded simply
as a popular publication by a business owner-manager about one of his
several enterprises, then it is subject to fewer strictures than could be levelled
against it if it were otherwise considered. There is, after all, little justification
for criticizing a book for not being scholarly or objective when it was never
intended by the author that it contain these qualities.

The book, having been published prior to last November’s presidential
election, presents answers to the author’s self-posed questions in a most
assured, if not supercilious, fashion. The curious reader will find no mention
of an “infant science” which latterly has become so popular a catchword
with the commercial pollsters and their apologists. Critics of Gallup’s
procedures, when mentioned at all, are dismissed as being irresponsible,
uninformed or undemocratic. Their criticisms are rationalized away either by
retreat to a reverent quotation from James Bryce or by some breezy non
sequitur. For example, Kornhauser’s (not mentioned by name) careful
exposition and criticism of the antilabor bias of AIPO from 1940 through 1945
is met by this reply: “Carried to the extreme, it would have meant that every
time survey results showed the public hostile to a Hitler, another survey
had to be found which was favorable to him.”

Gallup gives an earnest defense of his “quota” method of sampling as
compared to “area” or “probability” sampling. He also gives much lip service to
statistical theory and Bernoullian “laws of probability” in discussing
applicable formulae for determining size of sample in relation to probable
error; and he compliments Theodore Brown and Samuel S. Wilks for the
help they have given pollsters by “their studies on the statistical problems
involved in sampling.” The lay reader might well gain the impression that
a considerable body of statistical theory underlies Gallup’s “quota”
operations, but the inquiring statistician will be relieved to find this admission on
page 30: “While the accuracy of quota sampling cannot be determined from
mathematical formulae, principally because of the inability to calculate
the interviewer selection factor, there does exist a growing record of
performance of public opinion polls which have in the past been operating on
quota sampling procedures.”





An entirely new section is devoted to reprinting a paper entitled “The
Quintamensional Plan of Question Design.” This was first delivered by
Gallup in London before a meeting of the executives of his foreign affiliated
enterprises, and later published in the Public Opinion Quarterly. The space
consumed by this is one-thirteenth of the entire text of the book. One
wonders at its disproportionate inclusion among “questions people most
frequently ask about public opinion polls.” It is difficult to imagine a deluge
of letters and telegrams inquiring: “What is the quintamensional plan of
question design?” More pertinent, however, would be an answer to the
question concerning what proportion of AIPO surveys on controversial issues ever
make use of this five-point plan involving the use of (a) filter or information
questions, (b) open or free answer questions, (c) dichotomous or specific
issue questions, (d) reason why questions, and (e) questions bearing upon
intensity of opinion. The book gives us no clue as to the extent to which this
laudable plan is actually used. On the surface it would seem too expensive
for more than token use in a commercial operation.

A considerable compendium might be composed of important questions 
regarding opinion polling which are neither posed nor answered in this 
book. Some of these unanswered questions have become far more pressing 
than heretofore in view of the rent through the clouds which occurred last 
November, and many are currently being answered. But an extended
discussion of these matters is not particularly apposite in a review.

The title is as misleading as that given to the first Kinsey report. Not
only is this not a guide to opinion polls; it is not even a guide to all of Gallup’s
own polling activities, under whatever corporate names. Unfortunately
for such organizations as Chicago’s NORC, Michigan’s Survey Research
Center, and Columbia’s Bureau of Applied Social Research, the public is
all too likely to take the title at its face value.

If you want to learn in the shortest possible time the most complimentary 
things that Gallup has to say about the operations of his American Institute 
of Public Opinion, then this is the book for you. 


The Decomposition of a Series of Observations Composed of a Trend, a Periodic
Movement, and a Stochastic Variable. A. Hald (Assistant Professor, University
of Copenhagen, Copenhagen, Denmark). Copenhagen, Denmark: G. E. C. Gads
Forlag, 1948. Pp. 134. Paper.

Review by

D. B. DeLury, Director, Department of Mathematical Statistics
Ontario Research Foundation, Toronto 5, Canada

AND

Boyd Harshbarger, Professor of Statistics
Virginia Polytechnic Institute, Blacksburg, Virginia

The title of this work indicates clearly its content. The development
concerns a series of observations $y_{\nu i}$ ($\nu = 1, 2, \ldots, n$; $i = 1, 2, \ldots, k$), with
measurements of a concomitant variable $x_{\nu i}$, and assumes an underlying
model of the form

$$y_{\nu i} = T(x_{\nu i}) + \eta_i + \epsilon_{\nu i},$$

where $T$ denotes a trend, $\eta$ a periodic element and $\epsilon$ a random component.
The trend is assumed to be representable by a polynomial.

The fitting of this model to data is accomplished by constructing
orthogonal polynomials in $x$ (for the case in which the recorded $x$-values are in
arithmetic progression and the $y$’s have equal weights), such that, after
elimination of the $\eta$’s, the normal equations have diagonal form. Thus, the
trend may be fitted one term at a time until a “satisfactory” fit is reached.
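
The model described above can be illustrated with a short least-squares sketch. It is not the book's procedure: instead of the specially constructed orthogonal polynomials, it solves the full normal equations directly, with one indicator column per periodic position and ordinary powers of a centred plot position for the trend. The function name, the equally spaced positions and the scaling of $x$ are all assumptions made for illustration.

```python
import numpy as np

def fit_trend_plus_periodic(y, n, k, degree):
    """Least-squares fit of y[v, i] = T(x[v, i]) + eta[i] + error.

    y is read as n consecutive periods of k observations each; T is a
    polynomial of the given degree in the equally spaced position x.
    """
    y = np.asarray(y, dtype=float).reshape(n * k)
    pos = np.arange(n * k, dtype=float)
    x = (pos - pos.mean()) / pos.std()          # centred, scaled position
    # One indicator column per periodic position; these absorb the constant.
    periodic = np.tile(np.eye(k), (n, 1))
    # Trend columns x, x^2, ..., x^degree (no constant column).
    trend = np.column_stack([x ** d for d in range(1, degree + 1)])
    design = np.hstack([periodic, trend])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[:k], coef[k:], design @ coef

# Noise-free illustration: 5 periods of length 3 with a quadratic trend.
eta = np.array([1.0, -2.0, 0.5])
pos = np.arange(15)
y = np.tile(eta, 5) + 0.2 * pos + 0.01 * pos ** 2
eta_hat, trend_coef, fitted = fit_trend_plus_periodic(y, n=5, k=3, degree=2)
```

Because the trend columns here are not orthogonal to one another or to the periodic columns, raising the degree changes every coefficient; the orthogonal polynomials discussed in the review are what allow the trend to be extended one term at a time without refitting.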

These orthogonal polynomials are, of necessity, considerably more
complicated than those appropriate to fitting a trend only. Each pair of (n, k)
values requires its own set of polynomials. Hence a table of values of these
polynomials must be much more elaborate than that given, for example,
in Fisher and Yates’ tables. Part III of this study contains tabulated values
of the polynomials up to degree 5 for the following (n, k) pairs:

    k = 3     n = 5, 7
    k = 4     n = 25
    k = 5     n = 5, 7, 8, 9, 10, 11
    k = 12    n = 10

From the several fields in which this mathematical model might
conceivably be appropriate, the author selects two for special mention, the
treatment of economic time series and the analysis of agricultural field
experiments. The decision on the usefulness of the model in dealing with economic
series must be left to economists. Its value in the design and analysis of field
experiments will surely be widely questioned.

The agricultural experiment chosen to illustrate an application of these
methods is called a “row experiment,” in which the plots are placed
consecutively in a row and divided into blocks, so that each treatment occurs
once within each block. The arrangement of treatments within blocks is the
same for all blocks. The $x$’s then refer to the positions of the plots and the
$\eta$’s are treatment parameters.

One could, as the author remarks, employ this model when the treatments 
are arranged in any manner whatever, but when the “periodic” arrangement 
is abandoned, orthogonal polynomials are no longer feasible. In this case, 
regression analysis (presumably using ordinary orthogonal polynomials) or,
what amounts to the same thing, a covariance analysis with the position of 
the plot as concomitant variable, permits the use of the same mathematical 
model. Even when the treatments are randomly arranged within blocks, 
control of the error resulting from variation in fertility is possible by these 
means and such procedures have, in fact, been used a number of times. It 
is true that the computational labor is considerably greater when orthogonal
polynomials, of the sort developed in this book, are not available, but then,
when random arrangements are used, there is good reason to expect that 
such refinements will not be needed and that a simple analysis of variance 
will suffice. On the other hand, if one adopts a periodic arrangement of
treatments in the hope that fertility variations are expressible in terms of a
polynomial of low degree, his computational program is dismal indeed if 
this hope is not fulfilled. Furthermore, it is not clear to the reviewer that 
randomization can be dispensed with without assurance that variation in 
fertility is completely accounted for by the fitted polynomial (a viewpoint 
probably held also by the author, see p. 91), but presumably it never is. 

Whether or not this mathematical model is suitable for routine use in 
field experiments, it seems likely that there are many situations in which 
it is quite appropriate and a table of orthogonal polynomials to facilitate the 
computations it entails is a useful item of statistical equipment. It is to be 
hoped that a more complete table than the one included in this book will 
soon be issued. 

Part I, devoted to theory, includes a survey of standard regression theory 
and applies these classical methods to the problem at hand. The requisite 
orthogonal polynomials are developed and the necessary distribution theory 
is derived. The method of moving averages is discussed with respect to data 
to which this mathematical model is applicable and some comparison is 
made between this method, the type of regression analysis here developed 
and the analysis of variance as applied, for example, to a randomized block 
experiment. The methods used in this section are direct and the exposition 
is clear and concise. 

Part II presents three worked examples, the fitting of a time series, the 
analysis of a row experiment and a re-working of the data of an industrial 
example reported by H. E. Daniels.* 

The trend in the time series is fitted to degree 6 without reaching a good 
fit. The polynomial representing fertility variation in the row experiment 
is terminated at the fourth degree, on the ground that the coefficient of the
fifth degree polynomial is not significant, a questionable criterion for
general use. The same reason is given for choosing a third degree polynomial in
the industrial example. Regression analysis in this last case comes out at 
about the same place as does the analysis of variance used by Daniels. Some 
advantage is claimed for the regression approach, in that only 3 degrees of 
freedom are needed to accomplish what 24 are used for in the analysis of 
variance. 

Part II contains also a short but excellent discussion of some of the
theoretical considerations on which experimental design is based.

This work is well put together and written in a concise and
straightforward style that makes for easy reading. It can be recommended to those who
(a) would like to have a concise, accurate statement of the methods and
principal results of regression theory; (b) are interested in the theory of
moving averages; (c) may find appropriate the mathematical model to which
the book is devoted (e.g., in engineering research); (d) are perplexed about
the old controversy over systematic and random arrangements, even though
the concept of randomness is not dealt with as discerningly as one might
wish.

* “Some Problems of Statistical Interest in Wool Research,” J. Royal Stat. Soc. 5:89-112 ’38.

Quality Control Methods. Clifford W. Kennedy (Quality Control Engineer,
Federal Products Co., 1144 Eddy St., Providence, R. I.). New York 5: Prentice-Hall,
Inc. (70 Fifth Ave.), 1948. Pp. vii, 243. $4.75, trade edition; $3.55, text edition.
Two reviews follow:


Review by Sebastian B. Littauer
Associate Professor of Industrial Engineering
Columbia University, New York 27, N. Y.

Following the extensive as well as intensive application of statistical
quality control during the war, the appearance of a number of texts
at a variety of levels of interest and with a diversity of emphasis was to
be expected. Of the dozen or so books which have now been published,
Quality Control Methods seems to be the first which is admittedly addressed
to shop operating personnel, and therefore is to be viewed in that light.
One might, on that account, reasonably have anticipated a greater emphasis
on operating procedures, practical examples, administrative and economic
problems, and in general the important nonstatistical aspects of quality
control which have yet to be adequately presented. The fact, however, is
that this work is devoted primarily to reporting statistical aspects of quality
control, supposedly to fill the gaps left by other authors’ omission of “basic
details and primary concepts.”

The role of statistics in quality control is introduced by a general
discussion of sampling and the presentation of a variety of sampling plans including
some dozen pages on sequential sampling. Following a discussion of “batch
control” in Part 2, some elements of statistics are taken up in Part 3. Control
charts for variables are considered in Part 4, and the closing section is
devoted to administrative problems.

Statistical quality control can be motivated naturally by introducing
the principles of acceptance sampling which can, again, lead naturally into
control chart procedures. This has been effectively done by another author.
In the present work, however, acceptance sampling procedures are presented
apparently because “the progressive business house establishes a receiving
system designed to make certain that the goods delivered fulfill requirements
before the invoice is paid.” The presentation of the various sampling plans
is not prefaced by any exposition of statistical principles nor is their practical
use validated by bringing in the role of statistical control in acceptance
sampling practices. The exposition is necessarily descriptive and follows from
one plan to another, single, double and sequential, with little foundation
that can be called statistical. It seems questionable, at best, to devote so
much space to an attempted simplified explanation which “follows closely
the instructions in Sequential Analysis of Statistical Data: Applications”
when the readily available S.R.G. work does the job so well and so
thoroughly.

The present treatment neglects to show the natural place of acceptance 
sampling in statistical quality control viewed both as a system of thought 
and as a system of practices. The author seems to confuse estimation with 
testing hypotheses. For example, early in Part 1, page 12, the author advises: 
“Throughout the use of sampling methods, never lose sight of the fact that 
basically you are attempting to estimate or judge the whole by what you see 
and think from examining a part or a sample.” While this is altogether 
wholesome advice it does not apply to the use of the sampling plans
presented, and, in fact, the originators of these plans would argue quite strongly
against this practice. In a number of other places the author makes
reference to estimating either the “condition” of the lot, or the quality of the lot.

Many statements made by the author must be read with great care in
order to avoid misinterpretation. On reading (p. 7) that, “A sample is
defined by the dictionary as a part shown to prove the quality of the whole,”
this reviewer checked some six dictionaries. The closest approach to the
quoted statement was found to be, “A part of anything presented as evidence
of the quality of the whole,” from Webster’s Collegiate Dictionary (Merriam
and Co., 1917). Perhaps a statistical dictionary is needed. Again (p. 31),
“The sample size n may be a number of ounces or pounds, a proportion of a
fifty-ton gondola of ore, for example, or it may be so many drops of liquid,
or 100 persons interviewed during a public opinion poll in a certain city”
does not say what it may have been intended to mean. Nor can the statement
(p. 68) “In other words, the chances would be 95 per cent good (with the
buyers’ risk from error in sampling, p, at 5 per cent) that the lot was of
quality equal to $p_1$ ($p_1 = .05$), or better” be regarded as correctly interpreting
the use of a Wald sequential plan. In numerous other places in Part 1 the
text requires discriminating reading that it is not likely to get from shop
personnel who have had no previous statistical training. The whole
treatment of acceptance sampling might have been clarified and simplified by
use of OC curves. As it stands, the exposition is misleading to the
uninitiated.
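
To make the reference to OC (operating characteristic) curves concrete, the following minimal sketch tabulates the probability of accepting a lot as a function of its fraction defective. It assumes the simplest case, a single sampling plan with sample size n and acceptance number c, and a lot large enough for the binomial distribution to apply; the plan (n = 50, c = 2) and the function names are hypothetical and are not taken from the book under review.

```python
from math import comb

def prob_accept(p, n, c):
    """Probability that a lot with fraction defective p is accepted when
    n items are inspected and the lot passes with at most c defectives."""
    return sum(comb(n, d) * p ** d * (1 - p) ** (n - d) for d in range(c + 1))

def oc_curve(n, c, qualities):
    """Return (fraction defective, probability of acceptance) pairs."""
    return [(p, prob_accept(p, n, c)) for p in qualities]

# Hypothetical plan: inspect 50 items, accept on 2 or fewer defectives.
for p, pa in oc_curve(50, 2, [0.01, 0.02, 0.05, 0.10]):
    print(f"fraction defective {p:.2f}: P(accept) = {pa:.3f}")
```

Read this way, a plan is a statement about how often lots of each quality will be passed, which is a different thing from an estimate of the quality of any one lot.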

The remainder of the book is conventional except for the use of shop
language. In the author’s attempt to by-pass the preciseness of statistical
inference for the suggestiveness of intuitive insight he has inadvertently
exposed his whole exposition to question. Thus there is never a clear
distinction between parameters and statistics, which leads to confusion between
control limits and confidence intervals in the discussion of p-charts. This
may also account for the absence of a clear-cut explanation of and emphasis
upon the concept of statistical control, stressing the importance of time order
and rational subgrouping. There are no exercises for the student to work on
and few adequate illustrative examples. In spite of the six practical
problems in the last chapter, one might have hoped that this author would have
embellished his writing with more material taken from his extensive
experience.

It is rather surprising to this reviewer that there is no reference in the
body of the text or in the index to Shewhart’s contributions, and only a
belated reference to his earlier book in the brief bibliography. In spite of all
the space (26 pages) devoted to sequential sampling, Wald’s name does not
appear anywhere. On page 102, there is a replica of a table from page 23
of Introduction to Industrial Statistics and Quality Control by Paul Peach,
with no specific acknowledgment. It is customary to acknowledge any
complete reproduction apart from a general reference in the bibliography. Is it
not the responsibility of the publisher to supply careful editing in order to
pick up omissions like those mentioned?

Review by Charles R. Scott, Jr. 

Superintendent, Plant No. 1, SKF Industries, Inc. 

Front St. and Erie Ave., Philadelphia 34, Pa.

This is an informative book about industrial quality control. The author
handles the subject in a way that a beginner can understand. Rather than
attempting to write a handbook putting forth a series of tables to be used in
a mechanical sort of way, the author attempts to enable the beginner to
secure a reasonable grasp of the statistical theory underlying industrial quality
control. A discussion of the methods used in modern industrial plants carries
the reader along through the theoretical sections of the book. The text is
directed toward the practical man interested in statistical application of
quality control at the operating level.

The book is divided into five sections. Part I discusses acceptance
sampling, running through the gamut of sampling tables, sampling size, and the
various accepted sampling methods. Part II discusses batch control and the
practical methods of securing control by attribute inspection. Part III
discusses and outlines practical uses of the frequency distribution and the
standard deviation. In Part IV, Kennedy discusses average and range and
compares the standard deviation with the average-range method. Of great
importance is the way the author explains how to read a chart and interpret
the data collected. Part V very nicely winds up with some good
common-sense suggestions on what to do with the knowledge gained from the
book. The reader is not left confused as to how to apply practically what he
learned. The commercial examples form a basis for practical application.
The book includes a bibliography, an appendix, and an index.





Metoder att Uppskatta Noggrannheten vid Linje- och Provytetaxering.
[Methods of Estimating the Accuracy of Line and Sample Plot Surveys.] Bertil Matérn
(Student in the Institute of Mathematical Statistics, Stockholm, and Research
Statistician in the State Forest Research Institute, Stockholm). Stockholm,
Sweden: State Forest Research Institute, 1947. Pp. 138. Paper.

Review by T. W. Anderson
Assistant Professor of Mathematical Statistics
Columbia University, New York 27, N. Y.

This book consists of a study of the use of information in a survey of a
certain type for the estimation of the accuracy of that survey. This
kind of systematic sample survey of a region is used, for example, in
estimating certain characteristics of forests. The author utilizes the theory of
stochastic processes in his treatment of the use of quadratic forms in observed
variables for the estimation of the variances of estimates of various
characteristics. The mathematical results are applied to a variety of problems
arising in forestry surveys.

The kind of survey to which most attention is paid is a completely
systematic sample. A specified grid of survey lines, say $q$, is placed in a specified
way on the region $Q$, which is to be surveyed. The average of the
characteristic on the grid lines, say $f(q)$, is the estimate of $f(Q)$, the average of the
characteristic over the region. For instance, the estimate of the proportion
of the area of $Q$ which is forested is the proportion of the lengths of the
survey lines running through forests. It is customary to estimate the
“variance” of the error, $f(q) - f(Q)$, by means of a quadratic form in the
values of the characteristic on pieces of the survey lines. The principal topic
of this investigation is the selection of a quadratic form which will give a
good estimate of the variance of the error regardless of the topographical
variation.
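
As a rough illustration of the line-survey estimate described above, the sketch below represents the region as a rectangular grid of cells marked forested or not, treats every few rows as a survey line, and compares the proportion forested along the lines with the proportion over the whole region. The raster representation, the rectangular region, the horizontal lines and all names are assumptions made for illustration; none of them comes from the book.

```python
import numpy as np

def line_survey_proportion(forest_mask, spacing, offset=0):
    """Proportion of survey-line length that is forested, taking every
    `spacing`-th row of the mask (starting at `offset`) as a survey line."""
    mask = np.asarray(forest_mask, dtype=bool)
    lines = mask[offset::spacing, :]
    return float(lines.mean())

# Toy region: the upper half (rows 0 to 4) is forest, the rest is open land.
region = np.zeros((10, 10), dtype=bool)
region[:5, :] = True

f_Q = float(region.mean())                       # true proportion over the region
f_q = line_survey_proportion(region, spacing=2)  # estimate from rows 0, 2, 4, 6, 8
error = f_q - f_Q
```

In this toy case the grid has no chance element, so the error is a fixed number determined by where the lines happen to fall relative to the forest boundary. That is why some probability model is needed before a "variance" of the error can be discussed at all.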

The author suggests that a stationary stochastic process defined over a
plane be used as a probability model for this problem. To each point $(u, v)$ of
the plane we attach a random variable $f(u, v)$ such that the set of random
variables has the properties that the expectation of $f(u, v)$ is $m$ and the
variance is $\sigma^2$ and that the correlation of $f(u_1, v_1)$ and $f(u_2, v_2)$ is $\rho(t)$, where $t$
is the distance between $(u_1, v_1)$ and $(u_2, v_2)$. If $\rho(t)$ is assumed continuous, we
can define a random variable attached to a region as the integral of $f(u, v)$
over the region (with convergence meaning limit in the mean) divided by the
area of the region. For example, $f(Q)$ is the integral of $f(u, v)$ over $Q$ divided
by the area of $Q$. The random variable associated with a line is a line
integral of $f(u, v)$ divided by the length of the line. The expected value of a
certain kind of quadratic form in such variables can be written as

$$\int \rho(t)\, a(t)\, dt,$$

where $a(t)$ depends on the regions or lines involved.




The accuracy of a survey is indicated by the expected value $E\{T\}$, where
$T = L[f(q) - f(Q)]^2$ and $L$ is the length of $q$. The author finds an
approximation to $E\{T\}$ and an associated “distance function” $d(t)$. The expected values
of different quadratic forms used to estimate $E\{T\}$ are studied by comparing
the associated distance functions with $d(t)$. Certain quadratic forms are
suggested as best because their expected values are near $E\{T\}$ for all $\rho(t)$ in a
certain class.

The remainder of the book deals with some questions of the relation of the 
model to reality and with the application of the theory to practical problems. 
Plot surveys, sampling trees on the lines and double-sampling schemes are 
also considered. A number of examples are given, and computational
procedures are occasionally suggested. Although the main text is in Swedish,
there is an extensive summary in English (20 pages), and the table and figure 
headings are given in both languages. 

This study is of interest to statisticians on two counts, firstly as a study 
of stochastic processes from the point of view of statistical application and 
secondly as a presentation of means of choosing methods of estimating the 
accuracy of surveys. Of course, between the theory and the application is 
the interpretation of the probability model. A line survey here is a
completely systematic survey; there is no chance mechanism involved in placing
the lines. To enable him to apply probability calculus the author considers
the formation of the topography under the grid lines as a stochastic process.
The combined effect of Nature and man in giving the region the
characteristics to be surveyed is taken to be that of a process defined by this model.
Such an interpretation of the model leads to results different from those 
that would be obtained by considering placing a grid at random on a region 
with a given topography. 

Except for one feature of this study, this approach seems to give results 
that have reasonable interpretations for the practical problems. In fact, 
answers are given to questions which have previously been answered
inadequately or not at all. In devoting the remainder of the review to the one
questionable feature the reviewer does not wish to detract from the
otherwise fine work.

The purpose of a survey $q$ is to estimate some characteristic of a given
region $Q$. One would like to say something about the discrepancy between the
estimate, say $f(q)$, and the value for this specific region, say $f(Q)$. For
example, if $f(q)$ were normally distributed with mean value $f(Q)$ and variance
$\tau^2$, one could state the inequality $f(q) - \tau t \le f(Q) \le f(q) + \tau t$ with confidence
$\alpha$ if $t$ were chosen so the integral of the standard normal density from $-t$
to $t$ were $\alpha$. Confidence $\alpha$ would be interpreted as being approximately the
proportion of times such a statement would be correct if the statement were
made for a number of situations (specified by $f(q_1)$, $f(Q_1)$, $\tau_1^2$;
$f(q_2)$, $f(Q_2)$, $\tau_2^2$; $\ldots$). $\alpha$ would be independent of how $f(Q)$ would be chosen (as long as
the choice is independent of $f(q)$). In particular, if $f(Q)$ were fixed the above
theory would hold.
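
The confidence statement in the preceding paragraph can be sketched numerically. Following the reviewer's usage, $\alpha$ here is the confidence level itself (for example 0.95), not the complementary error rate; the multiplier $t$ is then the point at which the standard normal distribution function equals $(1 + \alpha)/2$. The function name and the numbers in the example are hypothetical.

```python
from statistics import NormalDist

def normal_confidence_limits(estimate, tau, alpha):
    """Limits f(q) - tau*t and f(q) + tau*t, where t is chosen so that the
    standard normal density integrates to alpha between -t and t."""
    t = NormalDist().inv_cdf((1 + alpha) / 2)
    return estimate - tau * t, estimate + tau * t

# Hypothetical survey: estimated forested proportion 0.40, tau = 0.02.
lower, upper = normal_confidence_limits(0.40, 0.02, 0.95)
```

The limits are only as good as the value of $\tau$ supplied, and the reviewer's objection below concerns exactly which average $\tau^2$ is supposed to represent.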





In his treatment the author takes $f(Q)$ to be a random variable, not a
parameter. Thus, $E\{[f(q) - f(Q)]^2\}$ is an average over all possible
topographical variations defined by the stochastic process over $Q$. But this is
not the average which one would like to use to define the accuracy of a
survey of a specific area. One would like the average taken over all
topographical variations for which $f(Q)$ is the value that this region actually
has. In short, one would like $f(Q)$ to be treated as a constant.

The probability model specifies a distribution of the set $f(u, v)$ which is
the product of the distribution of $f(Q)$ and the conditional distribution of
the set $f(u, v)$ given $f(Q)$. The use of the distribution of $f(Q)$ implies that we
believe that $f(Q)$ for the particular region $Q$ we survey arose in a way
described by the model. Now this is unrealistic because we may have chosen
$Q$ to survey simply because we think $f(Q)$ is in a certain range of values. This
question of using Bayes Theorem has been discussed so much that there is
no need of doing more in a review than pointing out that this is what the
author does.

It is possible to avoid this difficulty by using the conditional distribution
of $f(u, v)$ given $f(Q)$. Then the question arises as to whether the
mathematical results of this investigation hold for the conditional distribution.
Since the author’s conclusions are based on the distribution of $f(q_j)$ (values
on line segments $q_j$) and $f(Q)$, we need only consider the conditional joint
distribution of $f(q_j)$ given $f(Q)$. The author’s conclusions rest on approximate
equalities of the sort $E\{T_k\} \approx E\{T\}$, where $T_k$ is a quadratic form in $f(q_j)$.
Do these hold if $f(Q)$ is held fixed; that is, is $E\{T_k \mid f(Q)\} \approx E\{T \mid f(Q)\}$ true?
It can be shown that in general the second equality does not follow from the
first. This is easily demonstrated if the original process is normal because
then the joint distribution of $f(q_j)$ and $f(Q)$ is normal. Thus, the actual
conclusion we wish to draw from the theoretical study does not follow
directly.

It is possible, however, that these results hold in some approximate sense.
If the region $Q$ is large and if $\rho(t)$ decreases rapidly, $f(Q)$ is distributed with
a small variance, and, hence, may behave nearly like a constant. Then the
effect of holding $f(Q)$ fixed may not be great. However, in order to justify
his conclusions the author needs to study this question.

The author has exercised considerable ingenuity in solving difficult mathematical 
problems. The development is rigorous, except that many approximations 
are used; the effects of the approximations are studied to verify 
that they are legitimate. The problems of application considered seem to be 
important and interesting. It is to be hoped that the author in a future study 
will show that the objection to the use of the Bayes' Theorem approach can be 
met. 



326 AMERICAN STATISTICAL ASSOCIATION JOURNAL, JUNE 1949 

Lecture Notes on Mathematical Theory of Probability and Statistics. Richard 
von Mises (Gordon McKay Professor of Aerodynamics and Applied Mathematics, 
Harvard University, Cambridge, Mass.). Graduate School of Engineering, 
Special Publication No. 1. Cambridge 38, Mass.: Harvard Graduate School of 
Engineering (c/o Miss Natalie Nicholson, Librarian, Pierce Hall), 1947. Pp. 300. 
Paper, mimeographed. Out of print. 

Review by Benjamin Epstein 
Associate Professor of Mathematics, Wayne University 
Detroit 1, Michigan 

These are notes on a year course in probability and statistics given for 
undergraduates and graduates at Harvard University. The author has, 
in the opinion of this reviewer, given an excellent presentation of some of 
the most fundamental ideas in probability and statistics. It is a pity that 
these notes are not generally available. They would represent a welcome 
addition in a field where the really good elementary texts can be counted 
on the fingers of one hand. 

The frequency concept of probability is the dominant theme throughout 
the text and the author gives a very clear introduction to the basic notions 
underlying his approach to the axiomatic foundations of probability. In 
particular he defines at the outset the concepts of limiting frequency, chance 
as distinguished from randomness, collective, and the basic operations of 
place selection, mixing, partition, and combination. 

There are many workers in probability and statistics who object to the 
way in which von Mises formulates his axioms, to the way in which formal 
and empirical aspects of probability are mixed.* Personally this reviewer 
prefers the treatment of statistics as given by Cramér in his recent book, 
Mathematical Methods in Statistics. However, I am of the opinion that it 
would be a mistake to make the preference so strong that one becomes blind 
to the originality and the basic character of the von Mises approach. While 
there appear to be grounds for preferring the measure-theoretic approach 
(which is of course at heart also a frequency theory) to that of von Mises, 
one should recall that von Mises was a pioneer in the field of probability and 
that he grappled, as far back as 1919, with many of the problems underlying 
the foundations of probability. 

This reviewer, when reading the text, was made keenly aware more than 
once of the fact that the author is a master in the field and is a man who 
is not satisfied with a superficial glossing over of difficulties. He meets 
difficulties head on and brings them out into the light of day where they can be 
analyzed. This is particularly true in Chapters 3, 6, and 9. In these chapters 
he deals with (a) the weak law of large numbers; (b) Bayes’ theorem and its 
relation to the problem of inference; (c) certain aspects of the Neyman- 

* For an evaluation of this point, see: 

1. Mises, R. von. "On the Foundations of Probability and Statistics." Ann Math Stat 12:191-206 Je '41. 

2. Doob, J. L. "Probability as Measure." Ann Math Stat 12:206-14 Je '41. 

3. Mises, R. von, and Doob, J. L. "Discussion of Papers on Probability Theory." Ann Math Stat 12:215-17 Je '41. 


BOOK REVIEWS 




Pearson theory of testing hypotheses; (d) the contrast between Bayes’ 
theorem and the Neyman-Pearson theory of testing hypotheses; and (e) 
the confidence interval concept. He shows very clearly the role played by 
the a priori distribution of the unknown parameter or parameters in determining 
the kind of probability inferences that one can draw from a sample. 
These are precisely the sort of things which should be brought to the attention 
of the student who is just entering the field of statistics. 

The short sections on runs and Mendelian heredity are also very good and 
illustrate important statistical concepts. Another valuable feature of the 
book is that many of the problems given to the student are non-trivial and 
thought-provoking. The problems are drawn from many fields and help to 
illuminate parts of the theory that might otherwise have remained unclear 
to the student. 

The treatment throughout is rigorous and the author does not hesitate to 
call on techniques in modern analysis which are about on the level of those 
given in Whittaker and Watson’s Modern Analysis. Theorems are carefully 
stated and where concepts are involved the author takes the trouble to point 
out what would happen if certain underlying assumptions were not satisfied. 

The selection of certain sections or chapters for special comment is not to 
be taken to mean that the rest of the book is not well presented. The reviewer 
has only picked out a few of the things that were treated unusually well in 
this book and that are either omitted or glossed over in most texts. Some of 
the other topics are taken up as well in a book such as Cramér’s. 

By way of summary, it is my opinion that the continual emphasis by 
von Mises on underlying concepts makes good sense scientifically and pedagogically. 
I also feel that this book gives the student some insight into the 
historical development of the subject and into the nature of the early difficulties 
and paradoxes. It is particularly the beginning student who needs to 
be reminded over and over again of just what assumptions are made when 
one makes a probability statement or an inference from data. It is refreshing 
to find an author who is honest enough to make the following explicit statement 
(Chap. 6, p. 16): "It remains an invariable fact, dominating all problems 
in mathematical statistics, that no substantial inference can be drawn 
from a small number of observations if nothing is known 'a priori,' i.e., preliminary 
to the experiments, about the object of experimentation." 

Statistics in School. W. L. Sumner (Senior Lecturer in Education, University College, 
Nottingham, England). Oxford, England: Basil Blackwell & Mott Ltd. (49 
Broad St.), 1948. Pp. vii, 183. 9s. 6d. 

Review by F. G. Cornell 
Professor of Education and Director 
Bureau of Research and Service, College of Education 
University of Illinois, Champaign, Illinois 

This book, evidently intended for use as a text, is a summary of lectures 
given to postgraduate students in education at the University College, 
Nottingham. As such, it lacks some of the features which might be expected 
of a book purposely prepared as a textbook. The author recognizes the difficulties 
of students of education with standard mathematical treatments on 
statistics. This is a problem of teacher training institutions not only in 
Great Britain, but also in the United States. Unfortunately, a successful 
formula has not yet appeared for the intuitive, genuinely nonmathematical, 
yet technically adequate and sufficiently comprehensive, treatment of 
statistics which would reach larger numbers of educational leaders and the 
rank and file of teachers. 

The author of this book is reasonably successful in some of his attempts 
“to make it as simple as possible.” Included are some topics needed and usually 
not found in educational statistics such as an introduction to analysis of 
variance and factorial analysis. With these, however, there are questions 
of the value of the limited depth of treatment to which the author restricts 
himself evidently to avoid complexity. To what extent can a “simplified” 
text compromise with the criterion of adequacy of content coverage, for 
instance, by including factorial analysis but limiting its discussion to tetrad 
differences? In his presentation of partial correlation, the author covers 
only the three-variable problem without multiple regression, error of estimate, 
and other concepts useful in understanding the subject. 

The chapter on correlation and regression is typical of what is likely to 
result from confining the subject of educational statistics to 183 pages. In 
that chapter there appear technical errors of omission or commission which 
would lead an immature student to incomplete or incorrect information. 
Regression is introduced primarily as a means of defining the Pearson r. 
Thus, only the deviation form of the regression equation is given. The 
essential concepts of “error of estimate” and “prediction” oddly appear under 
a section headed Spearman's Footrule, instead of in the section on “regression” 
where they appear to belong. In this chapter, some technical and semantic 
difficulties appear in such condensed statements as: “In (c) there is perfect 
correspondence between the scores and correlation is complete and there 
is no regression” and “In education and psychology we usually find that 
correlation, if present, is partial positive correlation.” Incidentally, in a later 
chapter the author gives the old probable error for r formula, $.6745\,(1 - r^2)/\sqrt{N}$, 
and the Fisher z transformation as “hyperbolic arctangent,” in a way which 
would not encourage use of the latter. 

Certain important omissions prevent this book from being a significant 
addition to the field. Not included is a modern approach to sampling. A 
careful distinction is not made between large sample and small sample 
method, “statistic” and “parameter.” Notions of sampling distribution are 
introduced through the medium of the “probable error” anachronism. Chi 
square is presented with the limited emphasis of “goodness of fit” and contingency. 

The book throughout would be improved if there were greater use made of 
applications to educational problems and if examples were not restricted 
more or less to the rather sterile subject of school marks. The writing is 
heavily orientated in the direction of marking, to which relatively little 
concern has been given in education in this country in recent years. A 
broader orientation of a book on school statistics seems essential if it is to 
contribute significantly to the study of modern educational problems. 


Mathematical Treatment of the Results of Agricultural and Other Experiments, 
Second Edition. M. J. van Uven (Professor in the Agricultural University of 
Wageningen, Netherlands). Groningen-Batavia, Netherlands: P. Noordhoff 
N.V., 1946. Pp. viii, 310. 


Review by G. A. Baker 

Assistant Professor of Mathematics and Assistant Statistician 
in the Experiment Station, College of Agriculture 
University of California, Davis, California 

The present tendency in the design and analysis of field trials seems to be 
to carry the elaboration of the mathematical models first devised by 
R. A. Fisher to greater and greater extremes of complication. The theory 
of many such models is well worked out and tables provided so that the 
probabilities of the errors in decision made in any trial can be estimated. The 
difficulty is that the Fisherian models, in many cases, do not sufficiently 
approach reality in their fundamental assumptions about the behavior of 
soil fertility. Professor van Uven is refreshingly careful and thorough in his 
fundamental assumptions about the behavior of soil fertility. He assumes 
that soil fertility is a continuous function of position that changes slowly. 
His methods in this respect agree with Neyman's method of parabolic curves 
(1929). There is some evidence that, in some cases at least, this assumption 
of slowness of change in soil fertility is not sufficiently realistic. 

Professor van Uven's tests of significance are based on estimates of mean 
differences and standard errors obtained by the methods of least-squares 
and asymptotic averaging. Reference is then made to the normal probability 
curve regardless of the sample size and character of the population distribution. 
The whole discussion of significance is naive as compared with those of 
J. Neyman, E. S. Pearson, and R. A. Fisher. 

Professor van Uven discusses some of the simpler Fisherian designs in 
detail. In this respect he says that although sometimes the influences at work 
on the yield will not fully justify the hypotheses yet in many cases they will 
be very well applicable. This remark, of course, applies to situations with 
which Professor van Uven is familiar. 

A list of technical terms in nine languages is included. 

The second edition is a photographic reproduction of the first edition published 
in 1935. Professor van Uven intended to make some changes based on 
his further experience in teaching and examining field trials but could not 
because of shortages due to the war. 

The book is well and logically written and proceeds on the basis of assumptions 
that are clearly stated. It will well repay study by anyone interested 
in the analysis of field trial data. 




The Fourier Transforms of Probability Distributions. Aurel Wintner (Professor 
of Mathematics, Johns Hopkins University, Baltimore, Md.). Baltimore 18, 
Md.: the Author (Howland Hall, Charles and 34th Sts.), 1947. Pp. v, 185. Paper. 
$3.00. 

Review by J. Wolfowitz 
Associate Professor of Mathematical Statistics 
Columbia University, New York 27, N. Y. 

This publication is a set of notes, taken by Dr. P. W. Light, of a course 
given by Professor Wintner in 1942-3. It begins with a description of a 
(one-dimensional) distribution function. Successive chapters are entitled: 
transforms, moments, convolutions, convergent convolutions, convergence 
in measure, semi-groups (i.e., infinitely divisible laws), inversions, examples, 
and projections (i.e., multidimensional and marginal distributions). Some 
references are given at the end of the book. 

The reviewer immensely enjoyed reading the book. Since it is a set of 
notes it does not treat the subject exhaustively and gives us only glimpses of 
the author’s attractive mathematical style. Yet it proceeds pleasantly and 
easily, and readily absorbs the reader’s attention. 

The reviewer would, however, like to take exception to the choice of subjects 
and certain aspects of their presentation. In spite of its name, this is 
not a book on Fourier transforms; it is a book on distribution functions 
studied with the aid of Fourier transforms. For example, the question of 
when a function is a characteristic function is barely discussed. To the best 
of the reviewer’s knowledge the phrase “chance variable” or its equivalent is 
never mentioned. Yet the author is in effect compelled to introduce chance 
variables to clarify certain concepts. His section on “Examples” discusses 
mappings which are in effect chance variables of only moderate interest for 
the student of probability theory. There seems little point to so scrupulous 
an avoidance of the customary terminology of probability theory, with its 
attendant values of brevity and suggestiveness. All the “chance variables” 
which occur in this book are independent; this is often an unnecessary limitation. 
The question of the convergence of a series of independent chance 
variables is given considerable attention, yet neither the law of large numbers 
nor the central limit theorem is even mentioned. 

The reviewer was disappointed by the extreme paucity of appropriate 
references. The zero or one law, which is due to Kolmogoroff (generalization 
by Paul Lévy), is wrongly ascribed by the author to Borel; the reference is 
Rendiconti Palermo, Vol. 29, 1909, where a special case of the strong law of 
large numbers is proved. The following minor remarks occur to the reviewer: 
(1) The proof (p. 168), that the distribution of Student’s ratio is the same 
for all underlying joint distributions which are radially symmetric, can be 
considerably shortened. It suffices to note that Student’s ratio is homogeneous 
of degree zero; the statement is really valid for all such functions, provided 
only that the functions are undefined on at most a set of probability 
zero. (2) A simple proof by means of Fourier transforms of the fact that the 
totality of marginal distributions of every linear combination of h chance 
variables uniquely determines the joint distribution of the h variables (p. 
160) is due to Cramér and Wold and is to be found in the former’s book, Random 
Variables and Probability Distributions. 

The reviewer is of the opinion that, while no advanced worker in mathematics 
will want to be without a copy of this book, it is not particularly 
suited for statisticians or elementary students of probability theory. This is 
a great pity, and it is to be hoped that the distinguished author can be 
persuaded to treat the subject more comprehensively and systematically in a 
future book. 


Random Normal Deviates: 25,000 Items Compiled From Tract No. XXIV (M. G. 
Kendall and B. Babington Smith’s Tables of Random Sampling Numbers). 
Herman Wold (Professor of Statistics and Director of the Institute of Statistics, 
University of Uppsala, Uppsala, Sweden). University of London, University 
College, Department of Statistics, Tracts for Computers. London N.W.1: Cambridge 
University Press (Bentley House, 200 Euston Road), 1948. Pp. xiii, 51. 
Paper. 5s. 


Review by H. Burke Horton 
Bureau of Transport Economics and Statistics 
Interstate Commerce Commission, Washington, D. C. 

Statisticians are indebted to Professor Wold for substantially increasing 
the number of random normal deviates available for general use. Prior to 
the publication of this set of 25,000 items there were in published form only 
10,400 such items compiled by P. C. Mahalanobis from L. H. C. Tippett’s 
random numbers. 

In illustrating statistical processes a relatively small quantity of random 
normal deviates will usually suffice. However, for certain important research 
purposes, such as experimental deduction of the distribution function 
of a sampling statistic, large quantities of deviates are required. In this 
booklet Professor Wold has provided statisticians with 25,000 random 
normal deviates, each recorded to two decimal places. The items were 
derived from the Kendall-Smith tables of random numbers, column by 
column. It is unfortunate that Professor Wold’s labor was doubled, due to 
the fact that use of the Kendall-Smith tables by rows yielded a set for which 
the variance was too large by a significant amount (P = 0.7%). In the words 
of the author, “It is difficult to decide whether the failure with the first set 
[by rows] is accidental or due to some slight defect in the construction of 
Kendall-Smith’s tables.” If the latter possibility is the source of trouble, 
such difficulties may in the future be avoided, or at least minimized, by the 
use of recently developed convergent processes to generate the underlying 
set of random digits. 

For the convenience of the user, sums and sums of squares for the fifty 
items of each column are included. Tests for local randomness based upon 
sums, sums of squares, ranges, and sign runs, were applied to each page (500 
items), to blocks of 5,000 items, and to the entire set. The results were in 
accord with the hypothesis of normality. Subsets yielding unusual test 
results are listed for the benefit of users of small sets from the table. For 
clarity, in line 2, page vii, substitute “.0100 and .9900” for “0 and 1”. 

Textual material accompanying the tables is clearly and concisely written. 
The author presents illustrated techniques for the construction of univariate, 
bivariate, and multivariate normal distributions, with specified underlying 
parameters. The material presented on the construction of a multivariate 
normal distribution is of particular value as a reference. This booklet will be 
a useful addition to the libraries of statisticians and statistical organizations 
requiring sizeable quantities of random normal deviates for research or educational 
purposes. 

Say It With Figures. Hans Zeisel (McCann-Erickson, New York City). New 
York 16: Harper & Brothers (383 Madison Ave.), 1947. Pp. xix, 250. $3.00. 

Review by Gregor Sebba 
Professor of Economics, The University of Georgia 
Athens, Georgia 

The calamity that recently befell the public opinion polls points up the 
fact that the development of statistical methodology and techniques has 
by far outpaced the analysis of what Professor Paul F. Lazarsfeld, in his 
introduction to Hans Zeisel's book Say It With Figures, aptly terms “the 
conceptional meaning of statistical procedures.” Since Albert B. Blankenship 
reviewed the book in this Journal (42: 666-7 D '47) from the point of view 
of commercial research only, it seems appropriate to discuss briefly its contribution 
to conceptual analysis and its usefulness to teachers of statistics. 

Zeisel's book deals with three broad subjects: “Problems of Classification” 
(Part I), “Means of Numerical Presentation” (Part II) and “Tools of 
Causal Analysis” (Part III). Part I is primarily meant for users of the questionnaire 
and interview methods; Zeisel's discussion of “Don't know” 
answers (which can easily be adapted to “Undecided” answers) is particularly 
illuminating when applied to political opinion polls. But it is Parts II 
and III upon which the importance of the book rests. Among the “Means of 
Numerical Presentation,” Dr. Zeisel singles out percentages and simple 
indices for a penetrating analysis of their logic. The use of per cent figures is 
not generally advisable but needs “specific justification” and can be decided 
upon “only with a complete background of concrete data and specific 
circumstances” (p. 72). Their use for comparing increase or decrease in two or more 
populations, in particular, is logically justified only if the change is (or is 
treated as being) “in exact proportion to the factors chosen as a base for per 
cent computation” (p. 80). It thus turns out that per cent comparisons 
“offer only approximate solutions” since they merely discount “a priori the 
effects of concomitant variates” (R. A. Fisher); hence their use for comparison 
“will be justified [only] to the extent to which this a priori reasoning 
proves correct” (p. 81). In a two-dimensional table, per cents should be run 
in the direction of the variable to be studied for its effect, provided the 
sample is representative in this direction; if it is not, proper weighting becomes 
necessary (pp. 105-6). Of particular interest is Zeisel's subsequent 
study of the problem of reducing three- and more-dimensional tables (chap. 
6), a problem arising because “only tables containing two variables can be 
presented in their entirety and still be clearly readable” (p. 127). There follows 
an illuminating discussion of simple indices of the type developed in 
sociometrics, leading up to the warning that “there is a certain danger that 
somewhere along the way from a clearly defined object to its mathematical 
symbols, the clarity of thought is lost; indices sometimes pretend to measure 
a concept which . . . turns out to be ambiguous and, therefore, not measurable. 
Neither a descriptive label nor an impressive mathematical formula 
are a safeguard against . . . indices which do not measure what they purport 
to measure” (p. 166). 

Part III, “Tools of Causal Analysis,” contains a superior treatment of 
cross-tabulation as a tool of research. Cross-tabulation “refines” and 
“explains,” though the explanation may turn out to be spurious; the distinction 
between “true” and “spurious” inter-variable correlation depending on 
whether or not the correlation reflects a direct causal connection, i.e., 
whether the explaining factor is asymmetrically or symmetrically connected 
with the two variables (p. 202). Answering the question when to cross- 
tabulate, the author sets down the rule that if a result is analyzed successively 
by various breakdowns and it is known or suspected that some of them 
are interrelated, then these interrelated breakdowns should be tabulated, 
not successively, but simultaneously (p. 203). A critical discussion of the 
panel technique of interviewing concludes the book. 

Dr. Zeisel’s study represents a step forward on the road indicated by such 
classical earlier treatises as Zizek’s study of averages (1913), Winkler’s 
monograph on relatives (1923) and Haberler’s analysis of index numbers 
(1927). The book might be termed an essay in the logic of statistical 
procedures. The teacher of elementary and applied statistics will find in it useful 
numerical examples and charts of great forcefulness; and while much of 
the author’s discussion is too refined for beginners, there remains enough to 
enable the teacher to go through the initial chapters of an elementary text 
without putting the students (and himself) to sleep. 

Although Professor Lazarsfeld claims that “some of the inevitable errata” 
have been corrected in the 1948 printing, others seem to have passed unnoticed; 
among them an ugly blemish: in Table X-19 (pp. 240-2), there 
occur three questions and answers of the form: “If Hitler offered peace now 
.. ., would you favor or oppose such a peace?” Answer: “Yes.” 




JOURNAL OF THE AMERICAN 
STATISTICAL ASSOCIATION 


Volume 44 


September 1949 


Number 247 


ARTICLES 

The Monte Carlo Method. Nicholas Metropolis and S. Ulam 335 
Applications of Some Significance Tests for the Median Which Are Valid Under Very General Conditions. John E. Walsh 342 
A Sampling Study of the Merits of Autoregressive and Reduced Form Transformations in Regression Analysis. Guy H. Orcutt and Donald Cochrane 356 
Control of a General Census by Means of an Area Sampling Method. Gabriel Chevry 373 
A Procedure for Objective Respondent Selection Within the Household. Leslie Kish 380 
Beneficiary Statistics Under the Old-Age and Survivors Insurance Program and Some Possible Demographic Studies Based on These Data. Robert J. Myers 388 
By-Product Data and Forecasting in Unemployment Insurance. Nathan Morrison 397 
Statistical Requirements for Economic Mobilization. Ralph J. Watkins 406 
The War Production Board’s Statistical Reporting Experience, V and VI. David Novick and George A. Steiner 413 

BOOK REVIEWS 

Croxton, Frederick E., and Cowden, Dudley J., Practical Business Statistics, Second Edition. Alfred Cahen 444 
Dahlberg, Gunnar, Mathematical Methods for Population Genetics. Howard Levene 447 
Emmens, C. W., Principles of Biological Assay. Lila F. Knudsen 448 
Greenshields, Bruce D., Schapiro, Donald, and Ericksen, Elroy L., Traffic Performance at Urban Street Intersections. Henry K. Evans 451 
Jeffreys, Harold, Theory of Probability, Second Edition. Herbert Robbins 453 
Kendall, Maurice G., Rank Correlation. E. J. G. Pitman 454 
Panse, V. G., Report on the Scheme for the Improvement of Agricultural Statistics. S. Lee Crump 455 
Ricker, William E., Methods of Estimating Vital Statistics of Fish Populations. Charles M. Mottley 456 
Wilks, S. S., Elementary Statistical Analysis. T. A. Bancroft 458 

Letters About Books 460 


Index to Volumes 1-34, 1888-1939, may be obtained from the ASA. The Journal 
is also indexed in the Industrial Arts Index and the Public Affairs Information 
Service Bulletin. 







JOURNAL OF THE AMERICAN 
STATISTICAL ASSOCIATION 


Number 247 SEPTEMBER 1949 Volume 44 

THE MONTE CARLO METHOD 

Nicholas Metropolis and S. Ulam 
Los Alamos Laboratory 

We shall present here the motivation and a general description 
of a method dealing with a class of problems in mathematical 
physics. The method is, essentially, a statistical 
approach to the study of differential equations, or more 
generally, of integro-differential equations that occur in 
various branches of the natural sciences. 

Already in the nineteenth century a sharp distinction began to appear 
between two different mathematical methods of treating 
physical phenomena. Problems involving only a few particles were 
studied in classical mechanics, through the study of systems of ordinary 
differential equations. For the description of systems with very many 
particles, an entirely different technique was used, namely, the method 
of statistical mechanics. In this latter approach, one does not concentrate 
on the individual particles but studies the properties of sets of 
particles. In pure mathematics an intensive study of the properties of 
sets of points was the subject of a new field. This is the so-called theory 
of sets, the basic theory of integration, and the twentieth century development 
of the theory of probabilities prepared the formal apparatus 
for the use of such models in theoretical physics, i.e., description of 
properties of aggregates of points rather than of individual points and 
their coordinates. 

Soon after the development of the calculus, the mathematical apparatus 
of partial differential equations was used for dealing with the 
problems of the physics of the continuum. Hydrodynamics is the most 
widely known field formulated in this fashion. A little later came the 
treatment of the problems of heat conduction and still later the field 
theories, like the electromagnetic theory of Maxwell. All this is very 
well known. It is of course important to remember that the study of the 
physics of the continuum was paralleled through “kinetic theories.” 
These consist in approximating the continuum by very large, but finite, 
numbers of interacting particles. 


When a physical problem involves an intermediate situation, i.e., a 
system with a moderate number of parts, neither of the two approaches 
is very practical. The methods of analytical mechanics do not even give 
a qualitative survey of the behavior of a system of three mutually attractive 
bodies. Obviously the statistical-mechanical approach would 
also be unrealistic. 

An analogous situation exists in problems of combinatorial analysis 
and of the theory of probabilities. To calculate the probability of a 
successful outcome of a game of solitaire (we understand here only such 
games where skill plays no role) is a completely intractable task. On the 
other hand, the laws of large numbers and the asymptotic theorems of 
the theory of probabilities will not throw much light even on qualitative 
questions concerning such probabilities. Obviously the practical procedure 
is to produce a large number of examples of any given game and 
then to examine the relative proportion of successes. The “solitaire” is 
meant here merely as an illustration for the whole class of combinatorial 
problems occurring in both pure mathematics and the applied 
sciences. We can see at once that the estimate will never be confined 
within given limits with certainty, but only, if the number of trials is 
great, with great probability. Even to establish this much we must 
have recourse to the laws of large numbers and other results of the 
theory of probabilities. 
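
The procedure just described can be sketched in a few lines of Python. The game below is an assumption chosen only because it is easy to state and involves no skill: the player turns over a shuffled 52-card deck while counting ace, two, ..., king repeatedly, and wins if no card ever matches the rank being called. The function names, the trial count, and the seed are likewise hypothetical. The sketch reports the observed proportion of successes together with its binomial standard error, which is the sense in which the estimate holds only "with great probability."

```python
import random

def play_once(rng):
    """Play one skill-free game; return True on a win.

    Ranks are coded 0..12, four cards of each. The player calls ranks
    0, 1, ..., 12 cyclically while turning cards; any match loses.
    """
    deck = [rank for rank in range(13) for _ in range(4)]
    rng.shuffle(deck)
    return all(card != position % 13 for position, card in enumerate(deck))

def estimate_success(trials, seed=0):
    """Estimate the win probability from `trials` independent games."""
    rng = random.Random(seed)
    wins = sum(play_once(rng) for _ in range(trials))
    p = wins / trials
    # Binomial standard error of the observed proportion.
    std_err = (p * (1 - p) / trials) ** 0.5
    return p, std_err

if __name__ == "__main__":
    p, std_err = estimate_success(20000)
    print(f"estimated success probability: {p:.4f} +/- {std_err:.4f}")
```

Increasing the number of trials by a factor of four roughly halves the standard error.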

Another case illustrating this situation is as follows: Consider the 
problem of evaluating the volume of a region in, say, a twenty-dimensional 
space. The region is defined by a set of inequalities 

$$f_1(x_1, x_2, \ldots, x_{20}) < 0;\quad f_2(x_1, x_2, \ldots, x_{20}) < 0;\quad \ldots;\quad f_{20}(x_1, x_2, \ldots, x_{20}) < 0.$$

This means that we consider all points $(x_1, x_2, x_3, \ldots, x_{20})$ satisfying 
the given inequalities. Suppose further that we know that the region is 
located in the unit cube and we know that its volume is not vanishingly 
small in general. The multiple integrals will be hardly evaluable. The 
procedure based on the definition of a volume or the definition of an 
integral, i.e., the subdivision of the whole unit cube, for example, each 
coordinate $x_i$ into ten parts, leads to an examination of $10^{20}$ lattice 
points in the unit cube. It is obviously impossible to count all of them. 
Here again the more sensible approach would be to take, say 10^ points 



THE MONTE CARLO METHOD 




at random from this ensemble and examine those only; i.e., we should 
count how many of the selected points satisfy all the given inequalities. 
It follows from simple application of ergodic theorems that the estimate 
should be, with great probability, valid within a few per cent. 
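
The sampling procedure just described can be sketched in a few lines. This is only an illustration under stated assumptions: the region used here (each coordinate below 0.9) is a hypothetical stand-in for the twenty inequalities, chosen because its exact volume is known, and the names `estimate_volume` and `in_region` are not from the article.

```python
import random

def estimate_volume(inside, dim, n_points, seed=0):
    """Hit-or-miss estimate of the volume of a region lying in the unit cube."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_points):
        # Draw one point uniformly at random from the unit cube.
        point = [rng.random() for _ in range(dim)]
        if inside(point):
            hits += 1
    # The fraction of points that satisfy every inequality estimates the volume.
    return hits / n_points

def in_region(x):
    # Hypothetical inequalities f_k(x) = x_k - 0.9 < 0, k = 1, ..., 20.
    return all(xk - 0.9 < 0 for xk in x)

estimate = estimate_volume(in_region, dim=20, n_points=10_000)
exact = 0.9 ** 20  # known volume of this particular test region
print(f"estimate = {estimate:.4f}, exact = {exact:.4f}")
```

The cost depends on the number of sampled points, not on the number of lattice points in the cube, which is the point of the argument above.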

As another illustration, certain problems in the study of cosmic rays 
are of the following form. An incoming particle with great energy 
entering the atmosphere starts a whole chain of nuclear events. New 
particles are produced from the target nuclei, these in turn produce new 
reactions. This cascade process continues with more and more particles 
created until the available individual energies become too small to 
produce further nuclear events. The particles in question are protons, 
neutrons, electrons, gamma rays and mesons. The probability of producing
a given particle with a given energy in any given collision is
dependent on the energy of the incoming particle. A further complication
is that there is a probability distribution for the direction of motions.
Mathematically, this complicated process is an illustration of a
so-called Markoff chain. The mathematical tool for the study of such
chains is matrix theory. It is obvious that in order to obtain a mathematical
analysis, one would have to multiply a large number of $(n \times n)$
matrices, where $n$ is quite great.

Here again one might try to perform a finite number of "experiments"
and obtain a class or sample of possible genealogies. Those experiments
will of course be performed not with any physical apparatus, but theoretically.
If we assume that the probability of each possible event is
given, we can then play a great number of games of chance, with
chances corresponding to the assumed probability distributions. In this
fashion one can study empirically the asymptotic properties of powers
of matrices with positive coefficients, interpreted as transition probabilities.
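
A small sketch may make the comparison concrete. It assumes a hypothetical three-state transition matrix `P` (not taken from the article) and compares the state frequencies after ten steps, obtained by playing many "games," with the distribution obtained by repeated matrix multiplication.

```python
import random

def simulate_chain(P, start, steps, rng):
    """Follow one chain of events: at each step draw the next state at random."""
    state = start
    for _ in range(steps):
        state = rng.choices(range(len(P)), weights=P[state])[0]
    return state

def distribution_after(P, start, steps):
    """Exact distribution after `steps` transitions, by repeated multiplication."""
    dist = [0.0] * len(P)
    dist[start] = 1.0
    for _ in range(steps):
        dist = [sum(dist[i] * P[i][j] for i in range(len(P)))
                for j in range(len(P))]
    return dist

# Hypothetical transition probabilities; each row sums to one.
P = [[0.5, 0.5, 0.0],
     [0.2, 0.5, 0.3],
     [0.0, 0.4, 0.6]]

rng = random.Random(1)
n_games = 20_000
counts = [0, 0, 0]
for _ in range(n_games):
    counts[simulate_chain(P, start=0, steps=10, rng=rng)] += 1

empirical = [c / n_games for c in counts]
exact = distribution_after(P, start=0, steps=10)
print("empirical:", empirical)
print("exact:    ", exact)
```

For a matrix this small the direct multiplication is of course cheaper; the sampling approach becomes attractive only when n is too large for the matrices to be handled.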


II 

Finally let us consider more generally the group of problems which 
gave rise to the development of the method to which this article is devoted.
Imagine that we have a medium in which a nuclear particle is
introduced, capable of producing other nuclear particles with a distribution
of energy and direction of motion. Assume for simplicity that all
particles are of the same nature. Their procreative powers depend, however,
on their position in the medium and on their energy. The problem
of the behavior of such a system is formulated by a set of integro- 
differential equations. Such equations are known in the kinetic theory 
of gases as the Boltzmann equations. In the theory of probabilities one 



338 AMERICAN STATISTICAL ASSOCIATION JOURNAL, SEPTEMBER 1949 


has somewhat similar situations described by the Fokker-Planck equations.
A very simplified version of such a problem would lead to the
equation:

$$\frac{\partial u(x, y, z)}{\partial t} = a(x, y, z)\,\Delta u + b(x, y, z)\,u(x, y, z) \tag{1}$$

where $u(x, y, z)$ represents the density of the particles at the point
$(x, y, z)$. The Laplacian term, $a\,\Delta u$, on the right hand side corresponds
to the diffusion of the particles, and $bu$ to the particle procreation, or
multiplication. [In reality, the equation describing the physical situation
stated above is much more complicated. It involves more independent
variables, inasmuch as one is interested in the density
$w(x, y, z; v_x, v_y, v_z)$ of particles in phase space, $v$ being the velocity vector.] The
classical methods for dealing with these equations are extremely laborious
and incomplete in the sense that solutions in "closed form" are unobtainable.
The idea of using a statistical approach at which we hinted
in the preceding examples is sometimes referred to as the Monte Carlo
method.

The mathematical description is the study of a flow which consists of
a mixture of deterministic and stochastic processes.¹ It requires its own
laws of large numbers and asymptotic theorems, the study of which has
only begun. The computational procedure looks in practice as follows:
we imagine that we have an ensemble of particles each represented by
a set of numbers. These numbers specify the time, components of
position and velocity vectors, also an index identifying the nature of
the particle. With each of these sets of numbers, random processes are
initiated which lead to the determination of a new set of values. There
exists indeed a set of probability distributions for the new values of the
parameters after a specified time interval $\Delta t$. Imagine that we draw at
random and independently values from a prepared collection possessing
such distributions. Here a distinction must be made between those
parameters which we believe vary independently of each other, and
those values which are strictly determined by the values of other
parameters. To illustrate this point: assume for instance that in the
fission process the direction of the emitted neutron is independent of
its velocity. Or again, the direction of a neutron in a homogeneous
medium does not influence the distance between its origin and the site
of its first collision. On the other hand, having "drawn" from appropriate
distributions the velocity of a new-born particle and the distance to
its first collision, the time elapsed in travel is completely determined
and has to be calculated accordingly. By considering a large number
of particles with their corresponding sets of parameters we obtain in
this fashion another collection of particles and a new class of sets of
values of their parameters. The hope is, of course, that in this manner
we obtain a good sample of the distributions at the time $t + \Delta t$. This
procedure is repeated as many times as required for the duration of the
real process or else, in problems where we believe a stationary distribution
exists, until our "experimental" distributions do not show significant
changes from one step to the next.

¹ von Neumann, J., and Ulam, S., Bulletin A.M.S., Abstract 51-9-165 (1945).

The essential feature of the process is that we avoid dealing with 
multiple integrations or multiplications of the probability matrices, 
but instead sample single chains of events. We obtain a sample of the 
set of all such possible chains, and on it we can make a statistical study 
of both the genealogical properties and various distributions at a given 
time. 
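
The stepping of an ensemble can be illustrated with a deliberately crude one-dimensional sketch. Everything specific in it is an assumption made for illustration: the Gaussian displacement, the constant multiplication probability `birth_prob`, the slab of width `length`, and the function name `step` are not from the article, and the deterministic bookkeeping (for example, travel time computed from velocity and distance) is left out.

```python
import random

def step(particles, rng, sigma, birth_prob, length):
    """Advance every particle by one time interval and return the new ensemble."""
    new_particles = []
    for x in particles:
        # Random displacement during the interval (diffusion).
        x_new = x + rng.gauss(0.0, sigma)
        # A particle that leaves the medium is dropped from the ensemble.
        if not 0.0 <= x_new <= length:
            continue
        new_particles.append(x_new)
        # With the assumed probability the particle produces one more particle.
        if rng.random() < birth_prob:
            new_particles.append(x_new)
    return new_particles

rng = random.Random(2)
ensemble = [5.0] * 1000          # all particles start at the middle of the slab
history = []
for _ in range(50):
    ensemble = step(ensemble, rng, sigma=0.5, birth_prob=0.02, length=10.0)
    history.append(len(ensemble))

print("population after each of the last five steps:", history[-5:])
```

Each pass of the loop plays the role of producing a new set of cards from the old one; statistics of interest (population, spatial distribution) are read off the ensemble at any step.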


III

We want now to point out that modern computing machines are 
extremely well suited to perform the procedures described. In practice, 
the set of values of parameters characterizing a particle is represented, 
for example, by a set of numbers punched on a card. We have at the 
outset a large number of particles (or cards) with parameters reflecting 
given initial distributions. The step in time consists in the production 
of a new such set of cards. The original set is processed one by one by a 
computing machine somewhat as follows: The machine has been set up 
in advance with a particular sequence of prescribed operations. These 
divide roughly into two classes: (1) production of “random” values 
with their frequency distribution equal to those which govern the 
change of each parameter, (2) calculation of the values of those parameters
which are deterministic, i.e., obtained algebraically from the
others. It may seem strange that the machine can simulate the production
of a series of random numbers, but this is indeed possible. In fact,
it suffices to produce a sequence of numbers between 0 and 1 which
have a uniform distribution in this interval but otherwise are uncorrelated,
i.e., pairs will have uniform distribution in the unit square,
triplets uniformly distributed in the unit cube, etc., as far as practically
feasible. This can be achieved with errors as small as desired or practical.
What is more, it is not necessary to store a collection of such numbers
in the machine itself, but paradoxically enough the machine can
be made to produce numbers simulating the above properties by iterating
a well-defined arithmetical operation.
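
The article does not say which arithmetical operation is iterated. As one possible illustration, the sketch below uses a linear congruential recurrence, a technique swapped in here for concreteness; the multiplier, increment and modulus are commonly quoted constants and are assumptions, not values from the article.

```python
def lcg(seed, a=1664525, c=1013904223, m=2**32):
    """Yield numbers in [0, 1) by iterating state -> (a * state + c) mod m."""
    state = seed
    while True:
        state = (a * state + c) % m
        yield state / m

gen = lcg(seed=12345)
sample = [next(gen) for _ in range(10_000)]
mean = sum(sample) / len(sample)

# The same seed always reproduces the same sequence.
gen_again = lcg(seed=12345)
repeat = [next(gen_again) for _ in range(5)]

print("first values:", sample[:3])
print("sample mean: ", mean)
```

Nothing is stored except the current state, which is the sense in which the machine need not hold a prepared collection of random numbers.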

Once a uniformly distributed random set is available, sets with a
prescribed probability distribution $f(x)$ can be obtained from it by first
drawing from a uniform uncorrelated distribution, and then using, instead
of the number $x$ which was drawn, another value $y = g(x)$ where
$g(x)$ was computed in advance so that the values $y$ possess the distribution
$f(y)$.
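
One standard way to choose $g$, assumed here because the article does not specify its construction, is to take $g$ as the inverse of the cumulative distribution function of the target distribution. The sketch uses an exponential distribution with a hypothetical rate as the target, since its inverse is available in closed form.

```python
import math
import random

def exponential_from_uniform(x, rate):
    """g(x): inverse of the exponential distribution function 1 - exp(-rate * y)."""
    return -math.log(1.0 - x) / rate

rng = random.Random(3)
rate = 2.0  # hypothetical parameter of the target distribution
draws = [exponential_from_uniform(rng.random(), rate) for _ in range(50_000)]
mean = sum(draws) / len(draws)

print("sample mean:", mean, " theoretical mean:", 1.0 / rate)
```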

Regarding the sequence of operations on a machine, more can be and 
has been done. The choice of the kind of step to be performed by the 
machine can be made to depend on the values of certain parameters just 
obtained. In this fashion even dependent probabilistic processes can be 
performed. Quite apart from mechanized computations, let us point 
out one feature of the method which makes it advantageous with, say, 
stepwise integration of differential equations. In order to find a particular
solution, the usual method consists in iterating an algebraical
step, which involves in the $n$th stage values obtained from the $(n-1)$th
step. The procedure is thus serial, and in general one does not
shorten the time required for a solution of the problem by the use of
more than one computer. On the other hand, the statistical methods
can be applied by many computers working in parallel and independently.
Several such calculations have already been performed for problems
of types discussed above.*

IV 

Let us indicate now how other equations could be dealt with in a 
similar manner. The first, purely mathematical, step is to transform the 
given equation into an equivalent one, possessing the form of a diffusion 
equation with possible multiplication of the particles involved. For 
example as suggested by Fermi, the time-independent Schrödinger
equation

$$\Delta\psi(x, y, z) = (V - E)\,\psi(x, y, z)$$

could be studied as follows. Re-introduce time dependence by considering

$$u(x, y, z, t) = \psi(x, y, z)\,e^{-Et};$$

$u$ will obey the equation

$$\frac{\partial u}{\partial t} = \Delta u - Vu.$$


* Among others, problems of diffusion of neutrons, gamma rays, etc. To cite an example involving
the study of matrices, there is a recent paper by Goldberger, Phys. Rev. 74, 1269 (1948), on the interaction
of high energy neutrons with heavy nuclei.






This last equation can be interpreted however as describing the behavior
of a system of particles each of which performs a random walk,
i.e., diffuses isotropically and at the same time is subject to multiplication,
which is determined by the value of the point function $V$. If the
solution of the latter equation corresponds to a spatial mode multiplying
exponentially in time, the examination of the spatial part will give
the desired $\psi(x, y, z)$, corresponding to the lowest "eigenvalue" $E$.

The mathematical theory behind our computational method may be 
briefly sketched as follows: As mentioned above and indicated by the 
examples, the process is a combination of stochastic and deterministic 
flows.^ In more technical terms, it consists of repeated applications of 
matrices (as in Markoff chains) and completely specified transformations,
e.g., the transformation of phase space as given by the
Hamilton differential equations. 

One interesting feature of the method is that it allows one to obtain
the values of certain given operators on functions obeying a differential
equation, without the point-by-point knowledge of the functions which
are solutions of the equation. Thus we can get directly the values of the
first few moments of a distribution, or the first few coefficients in the
expansion of a solution into, for example, a Fourier series without the
necessity of first "obtaining" the function itself. "Symbolically," if one
is interested in the value of $U(f)$ where $U$ is a functional like the above,
and $f$ satisfies a certain operator equation $\psi(f) = 0$, we can in many
cases obtain an idea of the value of $U(f)$ directly, without "knowing" $f$
at each point.

The asymptotic theorems so far established provide the analogues of 
the laws of large numbers, such as the generalizations of the weak and 
strong theorems of Bernoulli, Cantelli-Borel.* The more precise information
corresponding to that given in the Laplace-Liapounoff
theory of additive processes has not yet been obtained for our more 
general case. In particular it seems very difficult to estimate in a precise 
fashion the probability of the error due to the finiteness of the sample. 
This estimate would be of great practical importance, since it alone 
would allow us to suit the size of the sample to the desired accuracy. 

The “space” in which our process takes place is the collection of all 
possible chains of events, or infinite branching graphs.^ The general 
properties of such a phase space have been considered but much 
work remains to be done on the specific properties of such spaces, each 
corresponding to a given physical problem. 


3 Everett, C. J. and Ulam, S., U.S.A.E.C., Los Alamos reports LADC-533 and LADC-534. Declassified, 1948.

* Everett, C. J. and Ulam, S., Proc. Nat. Acad. Sciences, 34, 403 (1948).




APPLICATIONS OF SOME SIGNIFICANCE TESTS FOR 
THE MEDIAN WHICH ARE VALID UNDER 
VERY GENERAL CONDITIONS* 


John E. Walsh 
The RAND Corporation 

In two other papers ([1] and [2]) order statistics were used to
derive some tests for the population median which have significance
levels which are either exact or bounded under some
very general conditions. These order statistic tests were found
to be very efficient for small samples from a normal population;
also they can be applied with very little computation.
This paper contains applications of these tests to several well
known statistical problems. Also a graphical method of applying
the tests of [1] is outlined.

INTRODUCTION 

The significance tests for the population median derived in [1] are
valid if the $n$ observations on which a test is based are independent
and are drawn from $n$ populations satisfying the conditions

(A)  1) Each population is continuous (i.e. its cumulative distribution
        is continuous).
     2) Each population is symmetrical.
     3) The median of each population has the same value $\phi$.

These tests compare $\phi$ with a given constant value $\phi_0$.

The tests for comparing $\phi$ with $\phi_0$ derived in [2] are based on the assumption
of a sample from a normal population. The significance level
of these tests, however, is bounded near the value for normality if the
$n$ independent observations are from populations necessarily satisfying
only conditions (A).

An important feature of conditions (A) is that no two of the observations
are necessarily drawn from the same population. Thus the order
statistic tests can be applied to a wide variety of situations.

The motivation for introducing tests with bounded significance levels 
in addition to the tests of [1] was to obtain a wider variety of suitable 
significance levels for small values of n without greatly weakening the 
generality of application. 

* The results presented in this paper were obtained in the course of research conducted under the
sponsorship of the Office of Naval Research. This research was performed while the author was at
Princeton University.






TABLE 1

SOME ONE-SIDED AND SYMMETRICAL SIGNIFICANCE TESTS FOR $n \le 15$

Symmetrical: accept $\phi \ne \phi_0$ if either one-sided condition in the row holds.

 n   Symmetrical   One-sided: accept $\phi < \phi_0$ if                    One-sided: accept $\phi > \phi_0$ if
     level

 6   9.4%          $\max[x_5, (x_4+x_6)/2] < \phi_0$                       $\min[x_2, (x_1+x_3)/2] > \phi_0$
     6.2%          $(x_5+x_6)/2 < \phi_0$                                  $(x_1+x_2)/2 > \phi_0$
     3.1%          $x_6 < \phi_0$                                          $x_1 > \phi_0$

 7                 $\max[x_5, (x_4+x_7)/2] < \phi_0$                       $\min[x_3, (x_1+x_4)/2] > \phi_0$
                   $\max[x_6, (x_5+x_7)/2] < \phi_0$                       $\min[x_2, (x_1+x_3)/2] > \phi_0$
                   $(x_6+x_7)/2 < \phi_0$                                  $(x_1+x_2)/2 > \phi_0$
                   $x_7 < \phi_0$                                          $x_1 > \phi_0$

 8                 $\max[x_6, (x_4+x_8)/2] < \phi_0$                       $\min[x_3, (x_1+x_5)/2] > \phi_0$
                   $\max[x_6, (x_5+x_8)/2] < \phi_0$                       $\min[x_3, (x_1+x_4)/2] > \phi_0$
                   $\max[x_7, (x_6+x_8)/2] < \phi_0$                       $\min[x_2, (x_1+x_3)/2] > \phi_0$
                   $(x_7+x_8)/2 < \phi_0$                                  $(x_1+x_2)/2 > \phi_0$
                   $x_8 < \phi_0$                                          $x_1 > \phi_0$

 9                 $\max[x_6, (x_4+x_9)/2] < \phi_0$                       $\min[x_4, (x_1+x_6)/2] > \phi_0$
                   $\max[x_7, (x_5+x_9)/2] < \phi_0$                       $\min[x_3, (x_1+x_5)/2] > \phi_0$
                   $\max[x_8, (x_5+x_9)/2] < \phi_0$                       $\min[x_2, (x_1+x_5)/2] > \phi_0$
                   $\max[x_8, (x_7+x_9)/2] < \phi_0$                       $\min[x_2, (x_1+x_3)/2] > \phi_0$
                   $(x_8+x_9)/2 < \phi_0$                                  $(x_1+x_2)/2 > \phi_0$

10                 $\max[x_6, (x_4+x_{10})/2] < \phi_0$                    $\min[x_5, (x_1+x_7)/2] > \phi_0$
                   $\max[x_7, (x_5+x_{10})/2] < \phi_0$                    $\min[x_4, (x_1+x_6)/2] > \phi_0$
                   $\max[x_8, (x_6+x_{10})/2] < \phi_0$                    $\min[x_3, (x_1+x_5)/2] > \phi_0$
                   $\max[x_9, (x_6+x_{10})/2] < \phi_0$                    $\min[x_2, (x_1+x_5)/2] > \phi_0$

11                 $\max[x_7, (x_4+x_{11})/2] < \phi_0$                    $\min[x_5, (x_1+x_8)/2] > \phi_0$
                   $\max[x_7, (x_5+x_{11})/2] < \phi_0$                    $\min[x_5, (x_1+x_7)/2] > \phi_0$
                   $\max[(x_6+x_{11})/2, (x_8+x_9)/2] < \phi_0$            $\min[(x_1+x_6)/2, (x_3+x_4)/2] > \phi_0$
                   $\max[x_9, (x_7+x_{11})/2] < \phi_0$                    $\min[x_3, (x_1+x_5)/2] > \phi_0$

12                 $\max[(x_4+x_{12})/2, (x_5+x_{11})/2] < \phi_0$         $\min[(x_1+x_9)/2, (x_2+x_8)/2] > \phi_0$
                   $\max[x_8, (x_5+x_{12})/2] < \phi_0$                    $\min[x_5, (x_1+x_8)/2] > \phi_0$
                   $\max[x_9, (x_6+x_{12})/2] < \phi_0$                    $\min[x_4, (x_1+x_7)/2] > \phi_0$
                   $\max[(x_7+x_{12})/2, (x_9+x_{10})/2] < \phi_0$         $\min[(x_1+x_6)/2, (x_3+x_4)/2] > \phi_0$

13                 $\max[(x_4+x_{13})/2, (x_5+x_{12})/2] < \phi_0$         $\min[(x_1+x_{10})/2, (x_2+x_9)/2] > \phi_0$
                   $\max[(x_5+x_{13})/2, (x_6+x_{12})/2] < \phi_0$         $\min[(x_1+x_9)/2, (x_2+x_8)/2] > \phi_0$
                   $\max[(x_6+x_{13})/2, (x_9+x_{10})/2] < \phi_0$         $\min[(x_1+x_8)/2, (x_4+x_5)/2] > \phi_0$
                   $\max[x_{10}, (x_7+x_{13})/2] < \phi_0$                 $\min[x_4, (x_1+x_7)/2] > \phi_0$

14                 $\max[(x_4+x_{14})/2, (x_5+x_{13})/2] < \phi_0$         $\min[(x_1+x_{11})/2, (x_2+x_{10})/2] > \phi_0$
                   $\max[(x_5+x_{14})/2, (x_6+x_{13})/2] < \phi_0$         $\min[(x_1+x_{10})/2, (x_2+x_9)/2] > \phi_0$
                   $\max[x_{10}, (x_6+x_{14})/2] < \phi_0$                 $\min[x_5, (x_1+x_9)/2] > \phi_0$
                   $\max[(x_7+x_{14})/2, (x_{10}+x_{11})/2] < \phi_0$      $\min[(x_1+x_8)/2, (x_4+x_5)/2] > \phi_0$

15                 $\max[(x_4+x_{15})/2, (x_5+x_{14})/2] < \phi_0$         $\min[(x_1+x_{12})/2, (x_2+x_{11})/2] > \phi_0$
                   $\max[(x_5+x_{15})/2, (x_6+x_{14})/2] < \phi_0$         $\min[(x_1+x_{11})/2, (x_2+x_{10})/2] > \phi_0$
                   $\max[(x_6+x_{15})/2, (x_{10}+x_{11})/2] < \phi_0$      $\min[(x_1+x_{10})/2, (x_5+x_6)/2] > \phi_0$
                   $\max[x_{11}, (x_7+x_{15})/2] < \phi_0$                 $\min[x_5, (x_1+x_9)/2] > \phi_0$


The purpose of this paper is to use these two types of tests to obtain 
generalized solutions for the cases of quality control, slippage, and the 
sign test. Also direct applications of the tests are considered. 

Tables 1 and 2 contain a list of some practically important one-sided
and symmetrical tests of the two types considered ($x_1, \ldots, x_n$ represent
the values of the $n$ observations arranged in increasing order of
magnitude). The tests of Tables 1 and 2 were chosen so that the significance
levels are approximately 5%, 2.5%, 1%, 0.5% for one-sided tests
and 10%, 5%, 2%, 1% for symmetrical tests; also so that the amount
of computation required for the application of a test is small.

To clarify the use of the tests of Table 1, consider the following
example: Let $n = 10$ and the values of the 10 observations be

0.7, -1.1, -0.2, -1.2, 0.1, 3.4, 3.7, 1.3, 1.8, 2.0.

The hypothesis to be tested is that the common median of the continuous
symmetrical populations from which these independent observations
were drawn has the value zero. Then $\phi_0 = 0$ and

$x_1 = -1.2$    $x_6 = 1.3$
$x_2 = -1.1$    $x_7 = 1.8$
$x_3 = -0.2$    $x_8 = 2.0$
$x_4 = 0.1$     $x_9 = 3.4$
$x_5 = 0.7$     $x_{10} = 3.7$

Apply the symmetrical test of Table 1 at the 5.1% significance level
to these observations. Then

$$\max[x_7, \tfrac12(x_5 + x_{10})] = \max(1.8, 2.2) = 2.2 > 0$$

$$\min[x_4, \tfrac12(x_1 + x_6)] = \min(0.1, 0.1) = 0.1 > 0.$$

Hence $\phi$ is significantly different from zero at the 5.1% significance
level.
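
The arithmetic of this example can be checked mechanically. The sketch below hard-codes the one pair of statistics used in the example ($n = 10$, symmetrical test at the 5.1% level); the function name is hypothetical. Note that $\tfrac12(x_1 + x_6)$ is 0.05 before rounding to the one decimal printed above.

```python
def walsh_symmetric_n10(values, phi0):
    """Symmetrical test for n = 10 built from max[x7, (x5+x10)/2] and min[x4, (x1+x6)/2]."""
    if len(values) != 10:
        raise ValueError("this sketch covers only n = 10")
    # Pad with None so that x[i] is the i-th observation in increasing order.
    x = [None] + sorted(values)
    upper = max(x[7], (x[5] + x[10]) / 2)
    lower = min(x[4], (x[1] + x[6]) / 2)
    # phi differs significantly from phi0 if either one-sided condition holds.
    significant = (upper < phi0) or (lower > phi0)
    return upper, lower, significant

observations = [0.7, -1.1, -0.2, -1.2, 0.1, 3.4, 3.7, 1.3, 1.8, 2.0]
upper, lower, significant = walsh_symmetric_n10(observations, phi0=0.0)
print(upper, lower, significant)
```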

As an example of the application of the tests of Table 2 let $n = 6$ and
the observations be

-0.01, 0.21, -0.72, 0.00, -0.05, -1.81.

The hypothesis to be tested is that the common median of the continuous
symmetrical populations from which these six independent observations
were drawn has the value 0.1. Then $\phi_0 = 0.1$ and

$x_1 = -1.81$    $x_4 = -0.01$
$x_2 = -0.72$    $x_5 = 0.00$
$x_3 = -0.05$    $x_6 = 0.21$.



SIGNIFICANCE TESTS FOR MEDIAN

TABLE 2

SOME ONE-SIDED AND SYMMETRICAL TESTS WITH BOUNDED SIGNIFICANCE LEVELS

$\max[x_7, (.5x_8 + .28x_7 + .22x_6)] < \phi_0$    $\min[x_2, (.5x_1 + .28x_2 + .22x_3)] > \phi_0$


Apply the symmetrical test of Table 2 which has a 5% significance
level if the $n$ observations are a sample from a normal population. Then

$$.63x_6 + .37x_5 = 0.132 + 0.000 = 0.132 > 0.1$$

$$.63x_1 + .37x_2 = -1.140 - 0.266 = -1.406 < 0.1.$$

Hence the median is not significantly different from 0.1 on the basis of
this test. If the six observations are a sample from a normal population,
the significance level of the test is 5%. If only conditions (A) are necessarily
satisfied, however, the significance level of the test is bounded
between 6.2% and 3.1%.

As pointed out in [1], if a symmetrical population has a mean, the 
mean equals the median. Thus the order statistic tests are tests of the 
mean if the populations considered have a mean and conditions (A) are
satisfied. 

The efficiencies listed for the tests in Tables 1 and 2 refer to the power 
efficiency of the order statistic test considered for the case in which the 
$n$ observations are a sample from a normal population. The definition
of the power efficiency of a test is given in section 3 of [1]. Essentially
determination of the power efficiency of a test consists in finding the
sample size (not necessarily integral) of the most powerful test of the
specified hypothesis (in this case the t-test) which has approximately
the same power function as the given test; this sample size divided by
the sample size for the given test is called the power efficiency of that
test. An intuitive explanation of the meaning of power efficiency is
given in [3]. Roughly speaking, the power efficiency of a test is the
percentage of the total available information per observation which is 
being utilized by that test. 

Section 2 contains an outline of a graphical method of applying certain
of the tests derived in [1]. The remaining sections contain methods
of applying the tests listed in Tables 1 and 2 to several well known
statistical problems. For simplicity the tests of Tables 1 and 2 will be
referred to as the tests of section 1. If tests for $n > 15$ or near significance
levels different from those approximated by the tests of Tables 1 and
2 are required, such tests can usually be obtained from the results given
in [1].

GRAPHICAL APPLICATION OF TESTS

Let us consider a graphical method of applying the tests of Table 1.
Since $\max(x, y) < \phi_0$ has the interpretation that both $x$ and $y$ are less
than $\phi_0$ while $\min(x, y) > \phi_0$ means that both $x$ and $y$ are greater than
$\phi_0$, it is only necessary to develop a method of deciding how expressions
of the forms $x_i$ and $\tfrac12(x_i + x_j)$ compare with $\phi_0$. Direct comparison
accomplishes this for the case of $x_i$. Thus it is sufficient to determine a
graphical method of deciding whether $\tfrac12(x_i + x_j) > \phi_0$ or $\tfrac12(x_i + x_j) < \phi_0$.

For nearly all situations to which the tests of Table 1 would be applied,
the $n$ observations can be considered to have practical lower and
upper limits, say $a$ and $b$. Using these limits a graphical method of
deciding when $\tfrac12(x_i + x_j) < \phi_0$ consists in constructing the region
$A$ of the $x_i, x_j$-plane defined by

$$\tfrac12(x_i + x_j) < \phi_0, \quad x_i < x_j, \quad a \le x_i, \; x_j \le b.$$

If the point $(x_i, x_j)$ falls in this region, $\tfrac12(x_i + x_j) < \phi_0$. Similarly
$\tfrac12(x_i + x_j) > \phi_0$ if $(x_i, x_j)$ falls into the region $B$ defined by

$$\tfrac12(x_i + x_j) > \phi_0, \quad x_i < x_j, \quad a \le x_i, \; x_j \le b.$$

FIG. 1. SCHEMATIC DIAGRAM OF REGIONS A AND B (the corner points $(a, a)$ and $(b, b)$ are marked on the diagram)


Fig. 1 contains a schematic diagram of the regions $A$ and $B$. These
regions are particularly easy to construct because the line $\tfrac12(x_i + x_j) = \phi_0$
is perpendicular to the line $x_i = x_j$ at the point $(\phi_0, \phi_0)$.

As the regions do not depend on i or j, a single graph will suffice for 
all the one-sided or symmetrical tests of Table 1. (It is assumed that 
the bounds a and b are the same for each test.) 

DIRECT APPLICATIONS 

One important application of the tests of section 1 consists in using 
these tests as substitutes for the corresponding t-tests. The order statistic
tests are more easily applied, valid under more general conditions,
and approximately as efficient as the corresponding t-tests. 

A second application occurs in cases where it is reasonably certain 
that the observations are from populations satisfying conditions (A) 
but there is no reason to suppose that the observations have the same 
precision. As an example, consider the examination of a gravimetric 
method of determining the amount of calcium oxide in given samples 
whose CaO content is known (see [4]). The results for 10 given samples 
are as follows: 


CaO Present (mg.)    CaO Found By Method (mg.)    Ratio

       4.0                    3.7                  .925
       8.0                    7.8                  .975
      12.5                   12.1                  .968
      16.0                   15.6                  .975
      20.0                   19.8                  .990
      25.0                   24.5                  .980
      31.0                   31.1                 1.003
      36.0                   35.5                  .986
      40.0                   39.4                  .985
      40.0                   39.5                  .988


If it can be assumed that the method used to determine the amount of
CaO is symmetrical, the above ratios satisfy conditions (A) with mean
(median) equal to unity when the null hypothesis that the average
amount of CaO found by the given method equals the true amount
holds. Since the amount of CaO present varies from 4.0 mg. to 40.0 mg.,
the populations from which the ratios are considered drawn may have
variances which differ noticeably. Thus application of the t-test to
these 10 ratios is a questionable procedure. As conditions (A) are satisfied,
however, the tests of section 1 are directly applicable. Apply the
symmetrical test of Table 1 at the 1.0% significance level to these 10
ratios. Then $\phi_0 = 1$ and

$$\max[x_9, \tfrac12(x_6 + x_{10})] = \max(0.990, 0.994) = 0.994 < 1.000$$

$$\min[x_2, \tfrac12(x_1 + x_5)] = \min(0.968, 0.9525) = 0.9525 < 1.000.$$

Thus the method examined yields an average value which is significantly
different from the true value at the 1.0% significance level.
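
As a check on this example, the sketch below recomputes the ratios from the tabulated amounts and applies the same pair of statistics. It is only an illustration: the variable names are hypothetical, and the ratios are used unrounded rather than to three decimals as in the table.

```python
# Amounts of CaO in mg., copied from the table of the ten samples.
present = [4.0, 8.0, 12.5, 16.0, 20.0, 25.0, 31.0, 36.0, 40.0, 40.0]
found = [3.7, 7.8, 12.1, 15.6, 19.8, 24.5, 31.1, 35.5, 39.4, 39.5]

# Ratios found / present, arranged in increasing order; x[i] is the i-th smallest.
ratios = sorted(f / p for f, p in zip(found, present))
x = [None] + ratios

phi0 = 1.0  # the ratio expected when the method is unbiased
upper = max(x[9], (x[6] + x[10]) / 2)
lower = min(x[2], (x[1] + x[5]) / 2)
significant = (upper < phi0) or (lower > phi0)

print(round(upper, 4), round(lower, 4), significant)
```

Because the upper statistic falls below unity, the symmetrical test declares a significant difference regardless of the value of the lower statistic.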

USE OF TRANSFORMATIONS 

Consider a situation where the $n$ populations from which the $n$ observations
were drawn are continuous, have the same median, but are
not symmetrical. (A sample from any continuous non-symmetrical
population satisfies these conditions.) In practical cases it is sometimes
known that replacing each observation value $x$ by the value $g(x)$, where
$g(y)$ is a continuous strictly monotonically increasing function of $y$, will
result in a set of observations from approximately symmetrical populations.
If $\phi$ is the common population median for the original observations,
$g(\phi)$ will be the population median for the transformed observations.
Thus the transformed observations are from populations approximately
satisfying conditions (A) with median $g(\phi)$.

Now $g(x_i)$ will be the $i$th in increasing order of the values of the transformed
observations if $x_i$ is the $i$th in increasing order of the original observations. Also
$g(\phi) > g(\phi_0)$ if and only if $\phi > \phi_0$. Similarly for $g(\phi) < g(\phi_0)$ and $g(\phi) \ne g(\phi_0)$.
Thus the tests of section 1 are easily modified to obtain tests of
$\phi < \phi_0$, $\phi > \phi_0$, and $\phi \ne \phi_0$. The procedure followed is to replace $x_i$ by
$g(x_i)$ and $\phi_0$ by $g(\phi_0)$ in the body of Tables 1 and 2 but leave everything
else as is. For example, for $n = 9$ the one-sided test

Accept $\phi < \phi_0$ if $\max[x_8, \tfrac12(x_7 + x_9)] < \phi_0$

is modified to the test

Accept $\phi < \phi_0$ if $\max\{g(x_8), \tfrac12[g(x_7) + g(x_9)]\} < g(\phi_0)$.

As another example let $n = 8$. Then the one-sided test

Accept $\phi > \phi_0$ if $\min[x_2, (.5x_1 + .28x_2 + .22x_3)] > \phi_0$

is modified to the test

Accept $\phi > \phi_0$ if $\min\{g(x_2), [.5g(x_1) + .28g(x_2) + .22g(x_3)]\} > g(\phi_0)$.

The choice of the function $g(y)$ will usually depend on past experience
with the type of situation being investigated. For example, in some
cases replacing each observation value by the log of that value has
been found to yield observations from approximately symmetrical populations.

Since only symmetry is required, there may exist many suitable
transformations which have not been used in the past because they do
not yield observations from approximately normal populations.

It should be emphasized that tests obtained by transformations are 
not necessarily also tests of the means of the original populations. 
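
The modification can be sketched for the $n = 9$ one-sided test with the logarithm as the transformation. The data below are hypothetical positive values invented for illustration, and the function name is an assumption.

```python
import math

def accept_below_n9(values, phi0, g):
    """One-sided test 'accept phi < phi0' for n = 9, applied to g(x_i) and g(phi0)."""
    if len(values) != 9:
        raise ValueError("this sketch covers only n = 9")
    # t[i] is the i-th transformed observation in increasing order.
    t = [None] + sorted(g(v) for v in values)
    statistic = max(t[8], (t[7] + t[9]) / 2)
    return statistic < g(phi0)

# Hypothetical skewed observations.
data = [1.2, 0.8, 2.5, 1.1, 0.9, 1.7, 3.9, 1.4, 2.0]

# An increasing transformation does not disturb the order of the observations.
order_kept = sorted(math.log(v) for v in data) == [math.log(v) for v in sorted(data)]

accept_at_5 = accept_below_n9(data, 5.0, math.log)
accept_at_2 = accept_below_n9(data, 2.0, math.log)
print(order_kept, accept_at_5, accept_at_2)
```

Only the averaged terms are affected by the transformation; a condition involving a single order statistic, such as $g(x_8) < g(\phi_0)$, is equivalent to the untransformed one.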



QUALITY CONTROL APPLICATIONS 

The tests of section 1 are readily adapted to control chart use. In
addition to being valid under extremely general conditions, very efficient
for normality, and easily applied, the resulting control chart tests
for the median have the valuable property of being independent of the
dispersion control chart tests in the sense that the construction and
application of the median control charts do not depend on any dispersion
values.

Two quality control situations are considered: Control with respect 
to a given standard value of the median; control with no standard 
given. 

FIG. 2. CONTROL CHART OF MEDIAN FOR SETS OF OBSERVATIONS OF SIZE 6. (Central line: given standard value. Dots: value of $\max[x_5, (x_4+x_6)/2]$. Crosses: value of $\min[x_2, (x_1+x_3)/2]$. The chart is labelled "below standard" and "above standard.")


As pointed out in section 1, if a symmetrical population has a mean,
the mean equals the median. Thus if each observation is drawn from a 
population satisfying the additional condition that the mean exists, the 
following quality control tests for the median are also tests for the 
mean. 

A. Control with respect to given standard. In this case a standard value
of the median $\phi_0$ is given. The method used in constructing and applying
control charts based on section 1 tests is demonstrated by the
following example: Consider construction of a control chart based on
sets of observations of size 6. The time $t$ is plotted on the horizontal
axis of the chart while a function $y$ of the six observations is plotted on
the vertical axis. The central line on the chart is $y = \phi_0$. The function
$y = \max[x_5, \tfrac12(x_4 + x_6)]$ is plotted on the chart with dots while the function
$y = \min[x_2, \tfrac12(x_1 + x_3)]$ is plotted with crosses. A dot falling below
the central line indicates that the true value of the median is below the
standard value; a cross above the central line indicates that the true
value of the median is above the standard value. Fig. 2 furnishes an
example of how this control chart might look in application. If conditions
(A) hold and the true value of the median equals the given standard
value, the probability of a dot falling below the central line equals
4.7%. Similarly for a cross falling above the line. The probability of
either a dot falling below the central line or a cross appearing above it is
9.4%. Thus, if the observations are from populations satisfying conditions
(A), the control chart exemplified by Fig. 2 represents a graphical
method of continually applying a symmetrical test of $\phi \ne \phi_0$ at the 9.4%
significance level. If a one-sided test of $\phi < \phi_0$ at the 4.7% significance level
is all that is desired, plot only the values of $\max[x_5, \tfrac12(x_4 + x_6)]$ on the
chart. If a one-sided test of $\phi > \phi_0$ at the 4.7% significance level is
sought, plot only the values of $\min[x_2, \tfrac12(x_1 + x_3)]$.

The method of constructing control charts outlined above is directly
applicable to all the section 1 tests. As another example, consider sets
of size 5. Let a dot be defined as $1.02x_5 - .02x_1$ and a cross by
$1.02x_1 - .02x_5$. Then Fig. 2 represents a continual application of a symmetrical
test of $\phi \ne \phi_0$ with significance level 5% for normality and upper bound
6.2% for conditions (A). Plotting $1.02x_1 - .02x_5$ alone furnishes a one-sided
test of $\phi > \phi_0$ with significance level 2.5% for normality and upper
bound 3.1% for conditions (A), etc.

Control charts based on unequal size sets of observations can also be 
readily obtained by use of the tests of section 1. The method used to 
obtain control charts for these cases consists in giving a separate defini¬ 
tion of what a dot and cross are to represent for each set size. The sig¬ 
nificance level can differ according to size or be approximately the same 
for all sizes. As an example of the procedure used, consider a control
chart using sets varying from 5 to 10 observations in size. Let the dots
and crosses for Fig. 2 be defined by Table 3. Then, if conditions (A)
hold, Fig. 2 represents a continual application of a symmetrical test of
$\phi \neq \phi_0$ at approximately the 5% significance level. The dots alone furnish
a one-sided test of $\phi < \phi_0$ at approximately the 2.5% level while
the crosses by themselves represent a one-sided test of $\phi > \phi_0$ at approximately
the 2.5% significance level.

TABLE 3

DEFINITION OF DOTS AND CROSSES

Set Size    Definition of Dot                 Definition of Cross

 5          $1.02x_5 - .02x_1$                $1.02x_1 - .02x_5$
 6          $.63x_6 + .37x_5$                 $.63x_1 + .37x_2$
 7          $\max[x_6, (x_5 + x_7)/2]$        $\min[x_2, (x_1 + x_3)/2]$
 8          $\max[x_6, (x_5 + x_8)/2]$        $\min[x_3, (x_1 + x_4)/2]$
 9          $\max[x_7, (x_5 + x_9)/2]$        $\min[x_3, (x_1 + x_5)/2]$
10          $\max[x_7, (x_5 + x_{10})/2]$     $\min[x_4, (x_1 + x_6)/2]$
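A chart with unequal set sizes amounts to a lookup of the dot and cross definitions by set size. The sketch below encodes Table 3 as we read it (several subscripts in the printed table are hard to make out, so the entries are an assumption) and, for the max/min entries, counts sign patterns to obtain the exact level under the same symmetry assumption for conditions (A). All names are hypothetical.

```python
from itertools import product

# (dot, cross) rules by set size; x is the sorted sample, 0-based, so x[0] is x_1.
TABLE_3 = {
    5: (lambda x: 1.02 * x[4] - 0.02 * x[0], lambda x: 1.02 * x[0] - 0.02 * x[4]),
    6: (lambda x: 0.63 * x[5] + 0.37 * x[4], lambda x: 0.63 * x[0] + 0.37 * x[1]),
    7: (lambda x: max(x[5], (x[4] + x[6]) / 2), lambda x: min(x[1], (x[0] + x[2]) / 2)),
    8: (lambda x: max(x[5], (x[4] + x[7]) / 2), lambda x: min(x[2], (x[0] + x[3]) / 2)),
    9: (lambda x: max(x[6], (x[4] + x[8]) / 2), lambda x: min(x[2], (x[0] + x[4]) / 2)),
    10: (lambda x: max(x[6], (x[4] + x[9]) / 2), lambda x: min(x[3], (x[0] + x[5]) / 2)),
}


def chart_points(sample):
    """Return (dot, cross) for a set of 5 to 10 observations."""
    x = sorted(sample)
    dot_rule, cross_rule = TABLE_3[len(x)]
    return dot_rule(x), cross_rule(x)


def exact_dot_level(n):
    """Exact chance of a dot below zero for the max/min rules (n = 7, ..., 10).

    Deviations from the median are taken as +/-1, ..., +/-n with equal odds;
    for these rules the count does not depend on the magnitudes chosen.
    """
    hits = 0
    for signs in product((-1, 1), repeat=n):
        dot, _ = chart_points([s * a for s, a in zip(signs, range(1, n + 1))])
        hits += dot < 0
    return hits / 2 ** n
```

On this reading the one-sided levels for sizes 7 to 10 come out at roughly 2.3%, 2.7%, 2.1% and 2.5%, in line with the "approximately 2.5%" stated in the text. The rules for sizes 5 and 6 are weighted sums whose exact level depends on the populations, so only the normal-theory level and an upper bound apply there.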

B. Control—no standard given (control chart test). For this case a standard
value of the median is not given. The standard value is replaced by
an estimate of the true median value made on the basis of past data
taken while the process was in control. Once a suitable estimate of the
median is obtained, the determination of control charts for this case
becomes identical with that considered in section A if $\phi_0$ is replaced
by the estimated value of the median. Hence the main problem is to
obtain a suitable estimate for the median on the basis of the given past
observations.

A very satisfactory estimate can be obtained if the past observations 
were drawn from populations satisfying conditions (A) and the addi¬ 
tional condition that the first three moments of each population are 
finite. Then the average of all the past observations furnishes an estimate
of the true value of the median. This estimate has the favorable
properties: 

1. The expected value of the estimate equals the true value of the 
median. 

2. The estimate tends to the true median value as the number of 
past observations increases. 

From an application viewpoint, the additional condition that the 
first three moments of each population are finite is not very restrictive; 
this condition is satisfied for nearly all populations arising in practice. 

Use of the above estimate in place of $\phi_0$ allows the control chart
methods developed in section A to be utilized.

GENERALIZED SLIPPAGE TEST 

The usual slippage situation investigated is the following: A change 
is made which affects a continuous population in such a way that the
shape of the population distribution remains fixed but the population 
mean may move. This slippage of the mean is tested on the basis of 
samples drawn from the population both before and after the change. 

The purpose of this section is to generalize the above slippage problem
and present a solution to the generalized situation. The generalized
slippage problem is the following: A change is made which affects m
continuous populations; h of these populations, (h = 0, 1, ..., m), are
affected in such a way that each population has the same shape distribution
before and after the change; the remaining $m - h$ populations
are symmetrical both before and after the change (but the distribution
shapes may change). It is required to test whether the change affected
the values of the means of the m populations.

The method used to derive this test consists in obtaining n dummy
observations, ($n \geq 6$), which satisfy conditions (A) with zero median
when the null hypothesis that all the means remained fixed (no slippage)
is true. Slippage tests with a wide variety of significance levels
can then be obtained by applying tests based on conditions (A) to these
dummy observations. In particular, if $n \leq 15$, the tests of Tables 1 and
2 of section 1 can be used.

The procedure used to obtain this test is the following: 

(a) Choose r such that $6 \leq rm \leq 15$. Then draw r samples
of size s from each population both before and after the change.
Record the order in which these samples were drawn. Form
the mean of each sample. Consider the mean of the i-th sample
drawn from the j-th population, (i = 1, ..., r; j = 1, ..., m),
after the change. Subtract this mean from the mean of the
i-th sample drawn from that population before the change. Under
the conditions of the generalized slippage problem it is easily
seen that the resulting rm dummy observations satisfy conditions
(A) with zero median if there is no slippage.

(b) $m \geq 6$. Draw a sample of size s from each population both before
and after the change. Form the mean of each sample.
Subtract the mean of the sample drawn from the j-th population
after the change, (j = 1, ..., m), from the mean of the sample
drawn from that population before the change. The resulting
m dummy observations satisfy conditions (A) with zero median
if the null hypothesis of no slippage is true.

In both cases (a) and (b) a set of independent dummy observations
satisfying conditions (A) with zero median is obtained under the null
hypothesis of no slippage. The slippage test is obtained by applying the
appropriate section 1 test to these dummy observations.
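Case (b) with m = 6 populations can be sketched as follows. The function names are hypothetical, and the test applied to the dummy observations is the size-6 test quoted earlier for control charts, each side at the 4.7% level. A dummy observation is the before-change mean minus the after-change mean, so a median below zero points to means that rose after the change.

```python
def sample_mean(values):
    """Arithmetic mean of one sample."""
    return sum(values) / len(values)


def dummy_observations(before, after):
    """One dummy observation per population: before-mean minus after-mean.

    before, after: lists of samples, one sample per population, same order.
    """
    if len(before) != len(after):
        raise ValueError("need one before and one after sample per population")
    return [sample_mean(b) - sample_mean(a) for b, a in zip(before, after)]


def slippage_flags(dummies):
    """Size-6 test of zero median: (median below zero, median above zero)."""
    if len(dummies) != 6:
        raise ValueError("this sketch covers m = 6 populations only")
    d1, d2, d3, d4, d5, d6 = sorted(dummies)
    median_below_zero = max(d5, 0.5 * (d4 + d6)) < 0
    median_above_zero = min(d2, 0.5 * (d1 + d3)) > 0
    return median_below_zero, median_above_zero
```

Other values of m (or of rm in case (a)) would call for the corresponding section 1 test in place of the size-6 rule used here.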

The tests can be generalized by replacing the condition that all sam¬ 
ples drawn are of the same size by more general conditions. Modifica¬ 
tions of this nature can be made in several obvious ways and will not be 
discussed here. Also in case (a) it is possible to choose r such that
$rm > 15$. The restriction $rm \leq 15$ was imposed so that the tests of
Tables 1 and 2 could be used.





GENERALIZED SIGN TEST 

A test which has wide application is the well known sign test (see,
e.g. [5]). This test is used to compare two kinds of objects (say X and
Y) with respect to a specified characteristic. The comparison is accomplished
by pairing the two types of objects. In each pair the value
of the characteristic obtained for the Y object is subtracted from the
value obtained for the X object. This procedure furnishes n dummy
observations $x_1 - y_1, \cdots, x_n - y_n$, where n is the number of
pairs formed. The conditions under which the observations
$(x_1, y_1), \cdots, (x_n, y_n)$ were obtained are such that the dummy observations
can be assumed to be independent and to have the property

(1) $\Pr(x_i - y_i > 0) = \Pr(x_i - y_i < 0) = \tfrac{1}{2}, \quad (i = 1, \cdots, n),$

if the null hypothesis of no difference between X and Y with respect to 
the specified characteristic is true. The sign test is then applied to test 
this null hypothesis on the basis of the signs (plus or minus) of the n 
dummy observations. 

Let us examine condition (1) from the viewpoint of approximate 
verification in practice. The main practical situations (i.e., situations 
approximately satisfied in practice) from which condition (1) can be 
deduced occur when there is reason to believe that each observation 
$(x_i, y_i)$ satisfies one or more of the following conditions when the null
hypothesis is true: 

(a) The joint cumulative distribution $F(x_i, y_i)$ of $x_i$ and $y_i$ is continuous
and such that $F(x_i, y_i) = F(y_i, x_i)$; i.e., $x_i$ and $y_i$ receive
identical treatment.

(b) $x_i$ and $y_i$ are independent and are from continuous symmetrical
populations with the same median value (the two populations
are not necessarily the same).

(c) $x_i$ and $y_i$ have the same population median and a normal bivariate
distribution.

In all of cases (a)-(c), the dummy observation $x_i - y_i$ is from a continuous
symmetrical population with zero median (see [6]). Thus, from
the viewpoint of practical verification, assuming that the n dummy
observations satisfy condition (1) is almost equivalent to assuming that
the dummy observations satisfy conditions (A) with zero median. This
suggests that the tests of section 1 be used for those practical situations
where the sign test is ordinarily used. The section 1 tests are preferable
to the sign test from the viewpoint of suitable significance levels and
efficiency for normality.

It would be very convenient if $x - y$ were from a continuous symmetrical
population with zero median whenever the observation (x, y)
satisfies the condition 

(d) x and y come from symmetrical populations with the same median
value; also the joint cumulative distribution of x and y is
continuous.

Then conditions (b) and (c) could be replaced by the single condition 
(d) and the problem of deciding when x—y is from a continuous sym¬ 
metrical population with zero median would be considerably simplified. 
The following counter-example, however, shows that condition (d) is 
not sufficient to ensure that x—y is from a continuous symmetrical 
population with zero median: 

Let the joint probability density function of x and y be defined by

$$f(x, y) = \begin{cases} \dfrac{1}{4} + \dfrac{x}{10}\,(1 - 3y^2) & \text{if } -1 \leq x, y \leq 1 \\ 0 & \text{otherwise.} \end{cases}$$

Integration shows that the marginal distributions of x and y are both
symmetrical with zero median. However $\Pr(x - y < 0)$ is easily shown
to not equal $\tfrac{1}{2}$.
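The counter-example can be checked numerically. The sketch below assumes the density reads f(x, y) = 1/4 + (x/10)(1 - 3y^2) on the square from -1 to 1 (our reading of a formula that is hard to make out in print) and integrates it by the midpoint rule; the function names and the grid size are illustrative choices.

```python
def density(x, y):
    """Assumed counter-example density on the square [-1, 1] x [-1, 1]."""
    if -1.0 <= x <= 1.0 and -1.0 <= y <= 1.0:
        return 0.25 + (x / 10.0) * (1.0 - 3.0 * y * y)
    return 0.0


def midpoints(n):
    """Midpoints of n equal cells on [-1, 1] and the cell width."""
    h = 2.0 / n
    return [-1.0 + (i + 0.5) * h for i in range(n)], h


def prob_x_less_than_y(n=200):
    """Midpoint-rule estimate of Pr(x - y < 0); diagonal cells count one half."""
    pts, h = midpoints(n)
    total = 0.0
    for i, x in enumerate(pts):
        for j, y in enumerate(pts):
            if i < j:
                total += density(x, y)
            elif i == j:
                total += 0.5 * density(x, y)
    return total * h * h


def prob_x_negative(n=200):
    """Midpoint-rule estimate of Pr(x < 0), a check on the marginal median of x."""
    pts, h = midpoints(n)
    return sum(density(x, y) for x in pts[: n // 2] for y in pts) * h * h
```

Under the assumed density the marginal of x has median zero, while direct integration gives Pr(x - y < 0) = 1/2 - 2/75 = 71/150, about 0.473, so the sign-test property (1) fails.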


REFERENCES

[1] John E. Walsh, "Some significance tests for the median which are valid under
very general conditions," Annals of Math. Stat., March, 1949.

[2] John E. Walsh, "On the range-midrange test and some tests with bounded
significance levels," Annals of Math. Stat., June, 1949.

[3] John E. Walsh, "On the 'information' lost by using a t-test when the population
variance is known," Jour. Amer. Stat. Assoc., March, 1949.

[4] Wallace M. Hazel and Warren K. Eglof, "Determination of calcium in magnesite
and fused magnesia," Ind. Eng. Chem. Anal. Ed., Vol. 18 (1946), pp.
759-760.

[5] W. J. Dixon and A. M. Mood, "The statistical sign test," Jour. Amer. Stat.
Assoc., Vol. 41 (1946), pp. 565-566.

[6] John E. Walsh, "Some significance tests for the median which are valid under
very general conditions." Unpublished thesis, Princeton University.



A SAMPLING STUDY OF THE MERITS OF AUTOREGRESSIVE
AND REDUCED FORM TRANSFORMATIONS
IN REGRESSION ANALYSIS


Guy H. Orcutt and Donald Cochrane
Department of Applied Economics, Cambridge

This paper is concerned with some aspects of regression 
analysis when the error terms are autocorrelated and there 
exists more than one relationship between the variables. In 
particular, we investigate the merits of autoregressive trans¬ 
formations and the reduced form transformation in dealing 
with these complications. 

An important result is that, unless it is possible to specify 
something about the intercorrelation of the error terms in a set 
of relations and to choose approximately the correct auto¬ 
regressive transformation, a certain amount of scepticism is 
justified concerning the possibility of estimating structural 
parameters from aggregative time series of only twenty ob¬ 
servations. 


1. INTRODUCTION 

The statistical estimation of the various parameters which enter
into theoretical formulations of economic relationships is one of the
main objectives of econometrics and the most common statistical technique
used is multivariate regression analysis. The classical method of
least squares regression has been shown to give best linear unbiassed
estimates of the coefficients when certain well known conditions are
fulfilled. If a linear relationship exists between the dependent variable
$x_{1t}$ and a set of independent variables $x_{2t}, \cdots, x_{pt}$ of the form

(1) $x_{1t} = b_0 + \sum_{j=2}^{p} b_j x_{jt} + u_t$

these conditions are satisfied if among other things¹

(i) the error term is non-autocorrelated, so that the expected value
$E(u_t u_{t+s}) = 0$ for $s \neq 0$;

(ii) each of the determining variables $x_{jt}$ $(j = 2, \cdots, p)$ is independent
of the error term, i.e., $E(x_{jt} u_t) = 0$ $(j = 2, \cdots, p)$.


¹ For a complete statement of the conditions under which least squares give "best unbiassed" estimates,
see F. N. David and J. Neyman, "Extension of the Markoff Theorem on Least Squares," Statistical
Research Memoirs, Vol. II, London 1938.




The formidable complications which have arisen in estimating the 
structural parameters of economic relationships have their origin, in 
so far as they are purely statistical in nature, in the fact that these two 
conditions are not realistic and have to be relaxed in most applications 
to economic data. The major complications which have arisen may be 
classified as follows: 

(a) the auto-correlated error complication; 

(b) the errors in variables complication; 

(c) the simultaneous equations complication. 

The first of these complications arises when condition (i), that the 
errors are independently distributed in time, does not hold while con¬ 
dition (ii), that the determining variables are independent of the error 
term, cannot be maintained when the other two complications are 
present. 

The simultaneous equations approach. The data used in most formulations
of economic relationships are obtained from historical time processes
and not from conducted experiments, and as a consequence are
the results of the solution of a system of simultaneous relations corresponding
to the economic processes involved. In order to obtain accurate
estimates of the structural parameters of any single equation it
may therefore be necessary to take account of the whole system of
simultaneous equations in which it occurs. Consider a simple illustration
of this problem. The consumption and price of a commodity enter
into both a demand and a supply relation so that if we attempt to find
the demand relation by considering only the regression of the quantity
consumed on the price of the commodity we are ignoring the fact that
price is not an independent variable but will depend on the nature of
the supply relation. Haavelmo² suggested that in such cases the variables
should be considered in a joint normal probability distribution
which should be studied to clarify the stochastical relationship which
the system of equations implies. Such a method assumes that the errors
in the equations are non-autocorrelated and normally distributed and
that there are no errors of observation in the variables.

It has been shown that for large samples the parameters estimated
by the method of maximum likelihood from this joint probability distribution
have certain optimal properties. They are asymptotically
unbiassed estimates and are also efficient statistics.³ One method of estimating
these parameters is to rewrite the system of equations in the
reduced form and solve for each of the endogenous variables in terms
of the lagged values of the endogenous variables and the exogenous
variables which appear in the system. These solutions will be in terms
of linear equations and the method of least squares can then be applied
by considering each endogenous variable in turn as the dependent
variable. The coefficients in this form possess the properties of best unbiassed
estimates for large samples. The structural parameters of the
original equations can then be derived from these coefficients but it
should be mentioned that it may not always be possible to identify the
structural parameters of the original relations from the coefficients of
the equations estimated in the reduced form. The problem of identification
is a very important one for the method discussed and a careful
analysis of the system of equations that is being considered should be
made before attempting any statistical application.⁴ The estimates obtained
from the reduced form method are maximum likelihood solutions
for an exactly identified system.

² T. Haavelmo, "The Probability Approach in Econometrics," Supplement to Econometrica, Vol.
12, July 1944.

³ A more complete discussion can be found in T. Koopmans, "Statistical Methods of Measuring
Economic Relationships," Cowles Commission Discussion Papers, Statistics No. 310 (mimeographed
copy of lectures delivered at the University of Chicago 1947) and T. Haavelmo, "Methods of Measuring
the Marginal Propensity to Consume," Journal of the American Statistical Association, Vol. 42, 1947,
pp. 103-122.

Autocorrelated error terms. In an earlier paper⁵ we showed that the error
terms involved in many current formulations of economic relationships 
are highly positively autocorrelated. In doing so, we demonstrated that 
under these circumstances the application of least squares regression to 
the original data produced very inefficient estimates of the parameters 
to be measured and suggested that this efficiency could be recovered by 
applying an autoregressive transformation to the variables which 
would make the error term approximately random. 

Objects of this paper. In this paper we are concerned with the problem
of carrying out regression analysis when the error terms are autocorrelated
and there exists more than one relationship between the variables.
In particular we investigate the merits of autoregressive transformations
and the reduced form transformation in dealing with these
complications. It is assumed that there are no errors of observation in
the variables.

The problems with which we are dealing are essentially deductive in 
nature and the ideal solution to them is one reached by purely deduc¬ 
tive steps from stated premises. However, since it has not been possible 

⁴ See Koopmans, "Statistical Methods of Measuring Economic Relationships," op. cit. A very good
description of the practical procedure is contained in M. A. Girshick and T. Haavelmo, "Statistical
Analysis of the Demand for Food: Examples of Simultaneous Estimation of Structural Equations,"
Econometrica, Vol. 15, 1947, pp. 79-110. Further references may be found in the various articles to which
we have referred.

⁵ D. Cochrane and G. H. Orcutt, "Application of Least Squares Regression to Relationships containing
Autocorrelated Error Terms," Journal of the American Statistical Association, Vol. 44, 1949,
pp. 32-61.



as yet to obtain such a solution, and considering the problems of some 
importance, we have resorted to the method of sampling experiments. 
That is, we embody our assumptions in experimental models, use these 
models to generate sets of time series and investigate empirically the 
results of various estimating procedures on the series generated. Up 
to the present only large sample properties of the parameters derived 
by the simultaneous estimation of structural relations have been dem¬ 
onstrated. Since economic data rarely comprise series of more than 20 
years, these large sample properties would seem to require more careful 
investigation and the sampling experiment approach provides a con¬ 
venient and legitimate method of making such an investigation. In 
addition it might be mentioned that the use of such methods might also 
provide the answers to many problems which have proved intractable 
to mathematical statistics and the improvements in calculating equip¬ 
ment are very welcome for these purposes. 

2. CONSTRUCTION OF THE EXPERIMENTAL MODELS

In order to reduce the computational burden as much as possible we
worked with the simplest types of systems which seemed at all reasonable
from the standpoint of applying any conclusions to economic
studies. Two models were adopted and are explained as follows. The
original series of Model I were generated by a recursive system of
equations

(2) $x_t = a_0 + a_1 y_t + (u_{1t} + u_{2t})$

(3) $y_t = b_0 + b_1 x_{t-1} + u_{3t}$

where $x_t$ and $y_t$ are the series to be considered and $u_{it}$ (i = 1, 2, 3) are
the error terms involved in the two relations. These error terms were
generated by the autoregressive equations

(4) $u_{it} = u_{i,t-1} + \epsilon_{it} \quad (i = 1, 2, 3)$

where the $\epsilon_{it}$ denote series of random disturbances. The values of the
parameters in (2) and (3) were chosen to be

(5) $a_0 = b_0 = 0, \quad a_1 = 1.0, \quad b_1 = 0.4$

The $\epsilon_{it}$ (i = 1, 2, 3) are independently distributed single digit random
numbers. They were extracted from Tables of Random Sampling
Numbers,⁶ ignoring zeros so that they ranged from 1 to 9. Subtracting
the number 5 from each we obtained three random series $\epsilon_{it}$ possessing
rectangular distributions with ranges of +4 to -4 and an expected
value of zero. The three series $u_{it}$ were then generated by applying
equations (4) with initial values of zero, so that we had three independent
series of first summations of random elements each comprising
over 500 terms. Taking $x_0$ as zero and making use of the properties of
the system given by (2) and (3), we then generated long series of $x_t$ and
$y_t$ from the three error series $u_{it}$. The first five items of the series of $x_t$
and $y_t$ were discarded and the remaining long series were each divided
into 20 segments of 21 items with 5 items omitted between segments.
One of these items was later used for prediction. By this procedure we
obtained a sample of 20 pairs of series generated by the same underlying
autoregressive structure but involving different samples of random
disturbances.
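The generating procedure for Model I can be sketched in Python. This is an illustration under stated assumptions, not the authors' computation: a pseudo-random generator stands in for the printed tables of random sampling numbers, the function names and the seed are hypothetical, and 520 terms are generated because 5 discarded items plus 20 segments of 21 items plus 19 gaps of 5 items come to 520.

```python
import random


def random_walk(n, rng):
    """First summation of digits 1..9 recentred to the range -4..4, starting from zero."""
    series, level = [], 0
    for _ in range(n):
        level += rng.randint(1, 9) - 5
        series.append(level)
    return series


def generate_model_one(n_terms=520, a1=1.0, b1=0.4, seed=0):
    """Generate x_t and y_t from equations (2) and (3) with a_0 = b_0 = 0 and x_0 = 0."""
    rng = random.Random(seed)
    u1, u2, u3 = (random_walk(n_terms, rng) for _ in range(3))
    xs, ys, x_prev = [], [], 0.0
    for t in range(n_terms):
        y = b1 * x_prev + u3[t]           # equation (3)
        x = a1 * y + (u1[t] + u2[t])      # equation (2)
        xs.append(x)
        ys.append(y)
        x_prev = x
    return xs, ys


def split_segments(series, skip=5, length=21, gap=5, count=20):
    """Drop the first `skip` items, then cut `count` segments with `gap` items between them."""
    segments, start = [], skip
    for _ in range(count):
        segments.append(series[start:start + length])
        start += length + gap
    return segments
```

Replacing `u3[t]` by `u2[t]` in the line for equation (3) would give the intercorrelated errors of Model II.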

For Model II the original series were generated by the same process
as just described for Model I except that instead of being independent
the error terms were now highly intercorrelated. This result was
achieved by using the same series of $u_{1t}$ and $u_{2t}$ as before but replacing
the series $u_{3t}$ by $u_{2t}$ so that the true correlation between the error terms
was 0.71. The recursive system therefore became

(6) $x_t' = a_0 + a_1 y_t' + (u_{1t} + u_{2t})$

(7) $y_t' = b_0 + b_1 x_{t-1}' + u_{2t}$

where the constants remained the same as given in (5). The same procedure
was used to obtain the individual sets of $x_t'$ and $y_t'$ as explained
for Model I.
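The quoted correlation of 0.71 can be checked directly. A brief sketch, assuming the random elements $\epsilon_{it}$ are independent with a common variance $\sigma^2$ (a symbol introduced here for illustration) and that each $u_{it}$ starts from zero, so that it is the sum of t random elements:

```latex
\begin{aligned}
\operatorname{Var}(u_{1t} + u_{2t}) &= 2t\sigma^2, \qquad
\operatorname{Var}(u_{2t}) = t\sigma^2, \qquad
\operatorname{Cov}(u_{1t} + u_{2t},\, u_{2t}) = t\sigma^2, \\
\operatorname{corr}(u_{1t} + u_{2t},\, u_{2t})
  &= \frac{t\sigma^2}{\sqrt{2t\sigma^2}\,\sqrt{t\sigma^2}}
   = \frac{1}{\sqrt{2}} \approx 0.71 .
\end{aligned}
```

The same ratio holds for the first differences of the two error terms, since the factor t cancels.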

Choice of parameters. Our choice of the autoregressive properties
of the error series $u_{it}$ was based upon the evidence presented in our
previous paper⁷ and the reasonableness of assuming that error terms
are first summations of random elements has been further supported
from the results obtained by Stone for a number of demand studies in
the United Kingdom.⁸ Our choice of the product $a_1 b_1$ was made so
that $x_t$ and $y_t$ would have approximately the same autoregressive

⁶ M. G. Kendall and B. Babington Smith, "Tables of Random Sampling Numbers," Tracts for
Computers No. 24, Cambridge University Press 1939.

⁷ D. Cochrane and G. Orcutt, op. cit.

⁸ Richard Stone, "The Analysis of Market Demand: An Outline of Methods and Results" read
before a meeting of the European section of the Econometric Society at The Hague, September 1948,
and to be published in The Review of the International Statistical Institute.



structures as claimed by Orcutt⁹ for the series used in Tinbergen's¹⁰
model of the economic system of the United States. For instance in
the case of independent error terms we can see from (2) and (3) that
the autoregressive structures of the two series are

(8) $x_t = x_{t-1} + 0.4(x_{t-1} - x_{t-2}) + \eta_{1t}$

(9) $y_t = y_{t-1} + 0.4(y_{t-1} - y_{t-2}) + \eta_{2t}$

where $\eta_{1t}$ and $\eta_{2t}$ are random disturbances defined in terms of $\epsilon_{it}$
(i = 1, 2, 3).
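The steps leading to (8) and (9) are not spelled out in the text. One way to obtain them, assuming equations (2) to (4) with $a_0 = b_0 = 0$, $a_1 = 1.0$ and $b_1 = 0.4$, is to substitute one relation into the other and then take first differences:

```latex
\begin{aligned}
x_t &= a_1 b_1 x_{t-1} + (u_{1t} + u_{2t} + a_1 u_{3t})
  && \text{(3) substituted in (2)} \\
x_t - x_{t-1} &= 0.4\,(x_{t-1} - x_{t-2})
  + \underbrace{\epsilon_{1t} + \epsilon_{2t} + \epsilon_{3t}}_{\eta_{1t}}
  && \text{first differences, using (4)} \\
y_t &= a_1 b_1 y_{t-1} + b_1 (u_{1,t-1} + u_{2,t-1}) + u_{3t}
  && \text{(2), lagged, substituted in (3)} \\
y_t - y_{t-1} &= 0.4\,(y_{t-1} - y_{t-2})
  + \underbrace{0.4\,(\epsilon_{1,t-1} + \epsilon_{2,t-1}) + \epsilon_{3t}}_{\eta_{2t}}
  && \text{first differences, using (4)}
\end{aligned}
```

On this reading both disturbances are sums of independent random elements, which is why the two series share the same autoregressive coefficient of 0.4 in first differences.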

Having made the decision as to the product $a_1 b_1$, only one more significant
decision remains to be made about the general structure of the
model. This is the correlation between either pair of variables $x_t$ and
$y_t$ or $y_t$ and $x_{t-1}$. For n approaching infinity this may be more clearly
seen as follows. We may express our model in the form

(10) $x_t = a y_t + v_{1t}$

(11) $y_t = b x_{t-1} + v_{2t}$

where $x_t$ and $y_t$ are in terms of deviations from their means and $v_{1t}$,
$v_{2t}$ are random error series. Expressing $x_t$ and $y_t$ in autoregressive
forms, we can derive the following relations:

(12) $\dfrac{E(x_t^2)}{E(v_{1t}^2)} = \dfrac{1}{1 - \rho^2}\left(1 + a^2\,\dfrac{E(v_{2t}^2)}{E(v_{1t}^2)}\right) = R_1$

(13) $\dfrac{E(y_t^2)}{E(v_{2t}^2)} = \dfrac{1}{1 - \rho^2}\left(\dfrac{\rho^2}{a^2}\,\dfrac{E(v_{1t}^2)}{E(v_{2t}^2)} + 1\right) = R_2$

where

$ab = \rho$, $R_1 = \dfrac{1}{1 - r^2_{x_t y_t}}$ and $R_2 = \dfrac{1}{1 - r^2_{y_t x_{t-1}}}$.

It can be readily seen that, having decided the correlation between
say $x_t$ and $y_t$, we have $R_1$ and since the term

$a^2\,\dfrac{E(v_{2t}^2)}{E(v_{1t}^2)}$


⁹ G. H. Orcutt, "A Study of the Autoregressive Nature of the Time Series Used for Tinbergen's
Model of the Economic System of the United States 1919-32," Journal of the Royal Statistical Society,
Vol. X, Series B, 1948, pp. 1-33.

¹⁰ J. Tinbergen, "Statistical Testing of Business-Cycle Theories, Vol. II: Business Cycles in the
United States of America 1919-32," League of Nations, Geneva, 1939.


appears in both (12) and (13), then $R_2$ is a function of $\rho$ and $R_1$ which
are both known and is automatically determined. All we have left to
decide is the weights to be assigned to the coefficient a and the relative
variances of the error terms. We made the relative variances

$E(v_{2t}^2)/E(v_{1t}^2) = \tfrac{1}{2}$

so that for $r^2_{x_t y_t} = 0.44$ we have $a = a_1 = 1$. The resulting form provides
an intermediate and reasonable model of a simplified economic system.
On the basis of the calculations needed for this study, it is possible to
work out the extreme cases in which either of the error terms has zero
variance. This is done in the next section.
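The arithmetic behind these choices can be checked with a short computation. The sketch assumes that relations (12) and (13) read R_1 = (1 + a^2 E(v_2^2)/E(v_1^2)) / (1 - rho^2) and R_2 = ((rho/a)^2 E(v_1^2)/E(v_2^2) + 1) / (1 - rho^2), with each R equal to 1/(1 - r^2); this is our reading of formulas that are hard to make out in print, and the function names are hypothetical.

```python
def variance_ratios(a, b, var_v1, var_v2):
    """Return (R1, R2) for the model x_t = a*y_t + v1t, y_t = b*x_{t-1} + v2t."""
    rho = a * b
    r1 = (1.0 + a ** 2 * var_v2 / var_v1) / (1.0 - rho ** 2)
    r2 = ((rho / a) ** 2 * var_v1 / var_v2 + 1.0) / (1.0 - rho ** 2)
    return r1, r2


def squared_correlations(a, b, var_v1, var_v2):
    """Return (r^2 between x_t and y_t, r^2 between y_t and x_{t-1})."""
    r1, r2 = variance_ratios(a, b, var_v1, var_v2)
    return 1.0 - 1.0 / r1, 1.0 - 1.0 / r2
```

With a = 1, b = 0.4 and error variances in the ratio 2 to 1, the first squared correlation is 0.44 and the second is 4/11, about 0.36.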

3. CALCULATIONS INVOLVED AND SOME SPECIAL CASES 

The results of the calculations carried out are contained in Tables 1
to 6. The equations referred to in these tables are:

Model I.

     I.   $x_t = a_0 + a_1 y_t + (u_{1t} + u_{2t})$
(14) II.  $y_t = b_0 + b_1 x_{t-1} + u_{3t}$
     III. $x_t = p_0 + p_1 x_{t-1} + (u_{1t} + u_{2t} + a_1 u_{3t})$

Model II.

     IV.  $x_t' = a_0 + a_1 y_t' + (u_{1t} + u_{2t})$
(15) V.   $y_t' = b_0 + b_1 x_{t-1}' + u_{2t}$
     VI.  $x_t' = p_0 + p_1 x_{t-1}' + (u_{1t} + (1 + a_1) u_{2t})$.

Both these systems are exactly identified. Equations I and IV were
calculated by the direct application of least squares regression. Equations
II and V are already in the reduced form so that the use of least
squares is the appropriate procedure. The reduced forms of equations
I and IV are equations III and VI respectively, so that the reduced
form estimates of $a_1$ and $a_0$ are given by

(16) $a_1' = p_1 / b_1$

(17) $a_0' = p_0 - b_0 a_1' = \bar{x}_t - a_1' \bar{y}_t$

where $\bar{x}_t$ and $\bar{y}_t$ are the means of the two series $x_t$ and $y_t$. The calculations
relating to these estimates are given by equations IA and IVA
in Model I and Model II respectively. For each of the equations we
have made first and second difference transformations and the regression
parameters have been estimated in the three forms. The original
relations and the autoregressive transformations are denoted by the
letters O, F.D. and S.D. in the tables.
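The reduced form estimates (16) and (17) can be sketched as two simple regressions followed by a division. The helper names are hypothetical and the series are assumed to be aligned lists x_0, ..., x_T and y_0, ..., y_T, with the regressions run over t = 1, ..., T.

```python
def ols_line(xs, ys):
    """Least squares intercept and slope of ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def reduced_form_estimates(x, y):
    """Estimate a_0 and a_1 through equations II and III, then (16) and (17)."""
    x_lag, x_cur, y_cur = x[:-1], x[1:], y[1:]
    b0, b1 = ols_line(x_lag, y_cur)   # equation II: y_t on x_{t-1}
    p0, p1 = ols_line(x_lag, x_cur)   # equation III: x_t on x_{t-1}
    a1 = p1 / b1                      # (16)
    a0 = p0 - b0 * a1                 # (17)
    return a0, a1


def single_equation_estimates(x, y):
    """Direct least squares regression of x_t on y_t (equation I)."""
    return ols_line(y[1:], x[1:])
```

Because both regressions share the regressor $x_{t-1}$, the intercept from (17) coincides with the mean of $x_t$ less $a_1'$ times the mean of $y_t$, which is the second equality stated in (17).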

Special cases. At the end of section 2 we pointed out that it is possible
to derive the cases where the variance of either one of the error terms is
equal to zero. In both these cases the least squares estimates and the
reduced form estimates lead to identical results. Rewriting the simple
system of (10) and (11)

(18) $x_t = a y_t + v_{1t}$

(19) $y_t = b x_{t-1} + v_{2t}$

where $v_{1t}$, $v_{2t}$ are random elements so that the reduced form of $x_t$ is

(20) $x_t = \rho x_{t-1} + v_{1t} + a v_{2t} \quad (\rho = ab)$

we can say that when (19) is an exact relation then the single equation
least squares estimates and the reduced form estimates of a are the
same and proportional to the estimate of the autoregressive coefficient
obtained in (20). When (18) is an exact relation then the single equation
least squares and the reduced form estimates of a are both exact.
These equalities may be more clearly seen as follows:

First assumption $E(v_{2t}^2) = 0$
so that our system becomes

(21) $x_t = a y_t + v_{1t}$

(22) $y_t = b x_{t-1}$

(23) $x_t = \rho x_{t-1} + v_{1t}$

and the least squares estimate of b is exact.

The single equation least squares estimate of a is given by

(24) $\hat{a} = \dfrac{\sum x_t y_t}{\sum y_t^2} = \dfrac{1}{b}\,\dfrac{\sum x_t x_{t-1}}{\sum x_{t-1}^2} = \dfrac{\hat{\rho}}{b}$

where $\hat{\rho}$ is the least squares estimate of $\rho$. The reduced form estimate
of a is therefore

(25) $\tilde{a} = \dfrac{\hat{\rho}}{b} = \hat{a}$.

Second assumption $E(v_{1t}^2) = 0$
so that our system is now

(26) $x_t = a y_t$

(27) $y_t = b x_{t-1} + v_{2t}$

(28) $x_t = \rho x_{t-1} + a v_{2t}$

and the single equation least squares estimate of a is exact. Now (27)
and (28) are identical except for a scalar multiplier a, therefore the
reduced form estimates of a are $\hat{\rho}/\hat{b}$ and will also be exact.
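Both special cases can be illustrated by simulation. The sketch below is an assumption-laden illustration rather than the authors' calculation: the series are generated with zero means so that no intercept is fitted, normal pseudo-random errors stand in for the random elements, and all names, seeds and lengths are hypothetical.

```python
import random


def sum_products(p, q):
    """Sum of elementwise products of two equal-length lists."""
    return sum(u * v for u, v in zip(p, q))


def estimates(x, y):
    """Return (single-equation estimate of a, reduced form estimate of a)."""
    x_lag, x_cur, y_cur = x[:-1], x[1:], y[1:]
    a_single = sum_products(x_cur, y_cur) / sum_products(y_cur, y_cur)  # as in (24)
    rho_hat = sum_products(x_cur, x_lag) / sum_products(x_lag, x_lag)
    b_hat = sum_products(y_cur, x_lag) / sum_products(x_lag, x_lag)
    return a_single, rho_hat / b_hat                                    # as in (25)


def simulate(a, b, sd_v1, sd_v2, n=200, seed=1):
    """Generate x_t = a*y_t + v1t and y_t = b*x_{t-1} + v2t from x_0 = 1."""
    rng = random.Random(seed)
    x, y = [1.0], [0.0]
    for _ in range(n):
        y_t = b * x[-1] + sd_v2 * rng.gauss(0.0, 1.0)
        x_t = a * y_t + sd_v1 * rng.gauss(0.0, 1.0)
        y.append(y_t)
        x.append(x_t)
    return x, y
```

Setting `sd_v2` to zero reproduces the first assumption, where the two estimates coincide but both inherit the error in $\hat{\rho}$; setting `sd_v1` to zero reproduces the second, where both equal a up to rounding.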

4. GENERAL RESULTS 

In the ensuing discussion we shall be content to point out the gen¬ 
eral and more important features of the calculations and if the reader 
desires further information it may be obtained from the tables which 
are presented in detail. 

Structural parameters. It has been proved by Mann and Wald¹¹ that
for large samples a linear stochastic difference equation may be treated
as a classical regression problem in which the lagged values of the series
appear as independent variables. However, the adequacy of the ordinary
least squares regression has not been demonstrated for small
samples, particularly of the size usually considered by economists,
and in fact Koopmans¹² has mentioned that for a sample of 3 items a
bias will be present in the least squares estimates of the parameters of
a single lag autoregressive equation. Orcutt¹³ has pointed out that this
bias is probably due partly to the necessity of using the sample means
of the time series instead of the true means and partly to the skewness
of the distribution of sample estimates even when the true means are
used. His empirical evidence shows that this bias is very substantial
for series having only a weak central tendency. If we look at equations
III and VI in Table 1 we find, for the first difference transformation,
examples of single lag autoregressive equations. In these cases the
means of the estimated regression coefficients are given, and it can be
seen that the biasses are 2.3 and 2.8 times the standard error of the
means of twenty estimates. This indicates that even for low values of
the autoregressive coefficient the bias is still rather large. When we
estimate the coefficients assuming a true mean of zero we find from
equations III and VI in Table 2 that the bias is considerably reduced
but is still far from negligible.

The question naturally arises as to whether a similar sort of small
sample bias is to be expected in single equation least squares and reduced
form estimates of the parameters of a system of recursive equations.
In the previous section we showed that when either one of the
equations was exact the estimates of both methods were the same. They
were exact in one case and possessed the same bias and variance as the

¹¹ H. B. Mann and A. Wald, "On the Statistical Treatment of Linear Stochastic Difference Equations,"
Econometrica, Vol. 11, 1943, pp. 173-220.

¹² T. Koopmans, "Serial Correlation and Quadratic Forms in Normal Variables," Annals of Mathematical
Statistics, Vol. 13, 1942, pp. 14-33.

¹³ G. H. Orcutt, op. cit.


coefficient of a single lag autoregressive equation of the type just con¬ 
sidered in the other. It is therefore of interest to examine first our 
Model I which corresponds to an intermediate case from the two ex¬ 
tremes and second our Model II which adds the complication due to 
intercorrelated error terms and accordingly a further bias to the single 
equation estimates by the direct correlation between the independent 
variable and the error series. 

First look at the single equation estimates of the regression coeffi¬ 
cients given in Table 1, by the sets of equations I and IV. The esti¬ 
mates based on the original series are badly biassed and have large 
variances in both methods. As expected the bias is greater in Model II 
where the error terms are intercorrelated. The mean of the variances of 
the regression coefficients estimated from each equation separately 
(see column 11) does not reflect the true position and is only a fraction 
of what it should be. When we make a first difference transformation 
the estimates of the regression coefficient in Model I possess very little 
bias while the mean of the estimated variances of the regression coeffi¬ 
cients also appears to be reasonable. However, in Model II there is still 
a large bias due to the correlation of the independent variable and the 
error term of equation IV. 

Turning to the reduced form estimates given by equations IA and
IVA we see that they are badly biassed for both the original series and
first differences of both models. In the case of the original series the
biasses are in the same direction as the single equation biasses but are
much larger; in the first difference transformation the bias is downwards
in both models and is due to the short series bias previously
considered.

TABLE 2

REGRESSION PARAMETERS CALCULATED BY ASSUMING
TRUE MEAN OF ZERO
(First Difference Transformation)

                  Regression Coefficient                    Correlation Coefficient

                                    Variance
                          Standard  Using     Using                 Standard
Equation  True    Mean    error of  mean      true          Mean    error of  Variance
          value           mean                value                 mean
          (1)     (2)     (3)       (4)       (5)           (6)     (7)       (8)

Model I
I         1.0     ?       ?         0.041     ?             0.66    0.03      0.014
IA        1.0     ?       ?         ?         ?             —       —         —
II        0.4     ?       0.03      ?         ?             0.58    0.03      0.017
III       0.4     ?       0.05      0.059     ?             0.33    0.05      0.056

Model II
IV        1.0     1.60    0.03      0.021     0.271         0.84    0.01      0.004
IVA       ?       0.66    0.13      0.352     0.446         —       —         —
V         ?       0.35    0.02      0.009     0.011         0.61    0.03      0.017
VI        ?       0.27    0.05      0.056     0.070         0.27    0.05      0.055

? = entry illegible or missing in the source.

So far we have considered the coefficients obtained when we estimated
the means in each transformation. It is therefore of interest to
see the bias in the estimates of the first difference transformation when
we make use of the fact that the true mean of the series is zero. This
is equivalent to assuming that there is no trend in the original
relationships. The results for the first difference transformation of both
models are given in Table 2. In the case of Model I the estimates
obtained by the single equation least squares regression are not biassed,
but the same bias as previously obtained is present in the case of
equation IV for Model II. The reduced form estimates are still biassed
in both models, although they show an improvement over the estimates
obtained when the means are estimated.

Our calculations may be used to see whether the means of the reduced
form estimates are significantly biassed from the true values
in the first difference transformations. When we calculate the coefficient
using estimated means we find from Table 1 that the values 0.62 from
equation IA and 0.52 from equation IVA are both significantly different
from the true value of the regression coefficient at the 5 per cent
level,* using the standard error of the mean calculated from the variance
around the estimated mean. In fact they are significantly different
from the true value of the coefficient at the 2 per cent level. When
we assume that the true means of the series are zero only the mean of
the reduced form estimates for Model II is significantly biassed. From
Table 2 it can be seen that the value of 0.79 for equation IA is not
significant at the 5 per cent level but the value of 0.60 for equation IVA
is significant at the 2 per cent level.

Consider now the efficiency of the least squares and reduced form
estimates. In Table 1 we find that the variances of the reduced form
estimates compare very unfavourably in all cases with the variances
of the single equation estimates. This may be illustrated by the ratios
of the reduced form variances to the single equation variances.
Even when we include the effect of the bias on the estimates by
calculating the variance around the true value of the regression
coefficient, the ratios still remain very high for Model I and although they
fall slightly in Model II they are still greater than unity. For the cases
where we assume a knowledge of the true mean, Table 2 shows that
the ratios of the variances of the reduced form estimates to the
variances of the single equation estimates around the estimated means
and the true value are 7:1 and 8:1 respectively for Model I and 17:1
and 1½:1 for Model II. These are still very high ratios and are
surprising results.

Ratios of variances of regression estimates

                         Using estimated   Using true
                         mean              value
Model I
  Original series        20:1              19:1
  First differences      8:1               11:1
Model II
  Original series        ?                 ?
  First differences      30:1              3:1

? = entry illegible or missing in the source.

* Using the t tables for 20 degrees of freedom. We are using both tails of the distribution but a case
can be made for using only one tail. This would halve the levels of significance.
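
As a rough check on the Model II ratios quoted above, the following sketch divides the reduced form variances by the single equation variances. It uses the Table 2 entries for equations IV and IVA as they can be read in this copy; the dictionary and function names are illustrative only.

```python
# Variances of the regression coefficient estimates read from Table 2
# (first differences, true mean assumed to be zero).
single_equation = {"around_mean": 0.021, "around_true_value": 0.271}  # equation IV
reduced_form = {"around_mean": 0.352, "around_true_value": 0.446}     # equation IVA


def variance_ratio(reduced, single):
    """Ratio of a reduced form variance to the matching single equation variance."""
    return reduced / single


ratios = {
    key: variance_ratio(reduced_form[key], single_equation[key])
    for key in single_equation
}

for key, value in ratios.items():
    print(f"{key}: {value:.1f}:1")
```

The two ratios come out close to 17:1 and a little above 1½:1, which agrees with the figures given in the text for Model II.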

In Model I, where the error terms are random and independent,
the single equation estimates would be the maximum likelihood solutions
for large samples and normally distributed error terms† and we
therefore expected the single equation approach to produce better
estimates than those obtained by the reduced form method. However,
where the error terms are random and intercorrelated the reduced
form estimates are the maximum likelihood solutions for large
samples if the correlation between the unlagged error terms is
unknown. This is the case of Model II and we should therefore have
expected the results of Model I to be reversed, but the discussion of the
last few paragraphs has shown that these expectations have been far
from realised.

Autocorrelation of residuals. The autocorrelations of the true error
terms and the estimated residuals are presented in Table 3. The measure
of autocorrelation used is the ratio of the mean square successive
difference to the variance. This statistic is usually denoted by $\delta^2/s^2$
and for a random series it is symmetrically distributed around a mean
of $2n/(n-1)$, where $n$ is the number of items in the series.‡ In a previous
paper§ we showed that highly positively autocorrelated error terms
become strongly biassed towards randomness as the number of parameters
in the estimation relationship increases. This result is again
illustrated in Table 3.

† R. Bentzel and H. Wold, "On Statistical Demand Analysis from the Viewpoint of Simultaneous
Equations," Skandinavisk Aktuarietidskrift, Vol. 29, 1946, pp. 95-114.

‡ The probability distribution of this statistic has been tabulated; see B. I. Hart and J. von
Neumann, "Tabulation of the Probabilities for the Ratio of the Mean Square Successive Difference to the
Variance," Annals of Mathematical Statistics, Vol. 13, pp. 207-214.

§ D. Cochrane and G. Orcutt, op. cit.
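
The ratio of the mean square successive difference to the variance can be sketched as below. The divisors are an assumption on our part: n - 1 for the successive differences and n for the variance, which is the convention under which a random series has mean 2n/(n-1). The simulation of independent normal series is purely illustrative and is not the authors' constructed data.

```python
import random


def von_neumann_ratio(series):
    """Ratio of the mean square successive difference to the variance."""
    n = len(series)
    mean = sum(series) / n
    # Mean square successive difference, divisor n - 1 (assumed convention).
    delta_sq = sum((b - a) ** 2 for a, b in zip(series, series[1:])) / (n - 1)
    # Variance about the sample mean, divisor n (assumed convention).
    s_sq = sum((x - mean) ** 2 for x in series) / n
    return delta_sq / s_sq


def average_ratio_for_random_series(n, replications, seed=0):
    """Average of the ratio over many independent normal series of length n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(replications):
        total += von_neumann_ratio([rng.gauss(0.0, 1.0) for _ in range(n)])
    return total / replications


n = 20
print("theoretical mean for a random series:", 2 * n / (n - 1))
print("simulated mean:", average_ratio_for_random_series(n, 20000))
# A steadily trending (positively autocorrelated) series gives a low ratio.
print("trending series:", von_neumann_ratio([1, 2, 3, 4, 5]))
```

Values well below 2 point to positive autocorrelation and values well above 2 to negative autocorrelation, which is how the entries of Table 3 are read.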

Further it should be noted that not only are the residuals biassed
when the single equation least squares method of estimation is used,
but, as can be seen from column 7 of Table 3, this bias appears to be
approximately the same when the reduced form method of estimation is
used.

TABLE 3

AUTOCORRELATION OF RESIDUALS

          Autoregressive                Values of $\delta^2/s^2$
          transformation
                              Values    Actual error series         Estimated residuals
Equation  Single    Reduced   for               Standard                    Standard
          equation  form      infinite  Mean    error of  Variance  Mean    error of  Variance
                              series            mean                        mean
          (1)       (2)       (3)       (4)     (5)       (6)       (7)     (8)       (9)

Model I
I         O                   ?         0.61    0.09      0.14      1.00    0.10      0.22
          F.D.                ?         2.16    0.10      0.20      2.21    0.10      0.21
          S.D.                ?         —       —         —         —       —         —
IA                  O         0.0       0.61    0.09      0.14      1.07    0.10      0.19
                    F.D.      ?         2.16    0.10      0.20      1.93    0.09      0.18
                    S.D.      ?         —       —         —         —       —         —
II        O                   0.0       0.43    0.06      0.08      0.97    0.11      0.22
          F.D.                2.0       2.05    0.09      0.17      2.00    0.08      0.14
          S.D.                3.0       —       —         —         —       —         —
III       O                   0.0       ?       ?         ?         1.49    0.10      0.21
          F.D.                2.0       ?       ?         ?         1.87    0.05      0.04
          S.D.                3.0       ?       ?         ?         —       —         —

Model II
IV        O                   ?         0.61    0.09      0.14      1.10    0.13      ?
          F.D.                2.0       2.16    0.10      0.20      2.50    0.10      ?
          S.D.                3.0       —       —         —         —       —         —
IVA                 O         0.0       0.61    ?         0.14      1.04    0.11      0.24
                    F.D.      2.0       2.16    0.10      0.20      2.07    0.11      0.24
                    S.D.      3.0       —       —         —         —       —         —
V         O                   0.0       0.51    0.06      0.08      1.17    0.10      0.18
          F.D.                2.0       2.17    0.09      0.17      2.01    0.08      0.11
          S.D.                3.0       —       —         —         —       —         —
VI        O                   0.0       —       —         —         1.47    0.11      0.22
          F.D.                2.0       —       —         —         1.92    0.06      0.06
          S.D.                3.0       ?       ?         ?         ?       ?         ?

? = entry illegible or missing in the source.

There is also an additional bias in the residuals caused by the
application of biassed estimates of the regression coefficients. This may
be seen in the residuals of the first difference transformation in
equation IV, Table 3 (column 7). It can be easily shown that the application
of biassed coefficients in this equation will produce negatively
autocorrelated error terms and this is illustrated in the result obtained.
A final interesting feature of the autocorrelation of the residuals is to
be seen in equations III and VI. In the first difference transformation
we are estimating the coefficient of a single lag autoregressive equation
with random disturbance but it is noticeable that because of the
downward bias in the estimate of the coefficient the residuals are not
completely randomized.

Prediction. In each equation we forecasted the dependent variable
for the next period from a knowledge of the independent variable and
using the regression coefficients calculated from the series up to that
period. The forecasts were then compared with the actual value of the
dependent variable obtained from our constructed series as explained
in section 2, and the variances of error are shown in Table 4. The
forecasts based directly upon the single equation estimates of the
parameters have a smaller variance of error than those based on the direct
use of the reduced form estimates of the parameters in the set of
structural equations IA and IVA. This is true for both models and each
autoregressive transformation. In both models the smallest variance
of error of forecast is given when using single equation estimates in the
first difference transformation.

TABLE 4

ERRORS OF FORECAST

          Autoregressive      Actual errors of forecast      Estimated mean
          transformation                                     variance of
Equation  Single    Reduced            Standard                       Individual
          equation  form      Mean     error of  Variance    Error    forecast
                                       mean
          (1)       (2)       (3)      (4)       (5)         (6)      (7)

Model I
I         O                   2.45     1.09      23.8        ?        24.0
          F.D.                -0.03    0.66      8.6         ?        16.2
          S.D.                -1.62    1.10      24.3        ?        31.3
IA                  O         6.19     4.67      417.1       83.4     —
                    F.D.      -1.03    0.76      11.6        18.1     —
                    S.D.      -6.86    3.43      235.7       198.3    —
II        O                   -1.27    1.04      21.6        11.4     13.8
          F.D.                -1.06    0.67      9.0         6.8      7.6
          S.D.                -1.12    0.68      9.2         12.3     13.9
III       O                   0.63     1.36      36.6        22.8     26.7
          F.D.                -1.18    1.04      21.9        21.4     23.8
          S.D.                -2.30    1.18      27.8        30.3     34.0

Model II
IV        O                   1.02     1.09      25.6        13.3     16.4
          F.D.                -0.66    0.59      7.0         11.2     12.4
          S.D.                -1.22    1.19      28.2        28.2     31.0
IVA                 O         1.66     1.14      26.0        14.4     ?
                    F.D.      0.38     0.91      16.4        24.2     —
                    S.D.      40.56    34.39     23664.      17470.   —
V         O                   0.38     0.74      10.9        10.0     11.6
          F.D.                0.02     0.63      ?           7.4      8.3
          S.D.                -0.77    0.68      9.2         11.7     13.1
VI        O                   2.10     1.07      23.0        36.2     41.4
          F.D.                0.31     1.03      21.0        34.9     38.9
          S.D.                -1.96    1.24      30.7        47.7     63.9

? = entry illegible or missing in the source.
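
The forecasting procedure described in the paragraph on prediction can be sketched as follows: fit the regression on the series up to a given period, forecast the next value of the dependent variable from the known independent variable, and record the error. The generating model used here (a unit slope with independent normal errors), the series length and the function names are all assumptions made for illustration; they are not the constructed series of the paper.

```python
import random


def ols_fit(x, y):
    """Least squares intercept and slope for y = a + b * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def one_step_forecast_error(x, y):
    """Fit on all but the last observation, then forecast the last one."""
    intercept, slope = ols_fit(x[:-1], y[:-1])
    forecast = intercept + slope * x[-1]
    return y[-1] - forecast


def simulate_forecast_errors(replications=20, length=21, seed=1):
    """Forecast errors for hypothetical series: 20 periods to fit, 1 to forecast."""
    rng = random.Random(seed)
    errors = []
    for _ in range(replications):
        x = [rng.gauss(0.0, 1.0) for _ in range(length)]
        y = [1.0 * xi + rng.gauss(0.0, 1.0) for xi in x]  # hypothetical model
        errors.append(one_step_forecast_error(x, y))
    return errors


errors = simulate_forecast_errors()
mean_error = sum(errors) / len(errors)
variance = sum((e - mean_error) ** 2 for e in errors) / (len(errors) - 1)
print("mean error:", mean_error)
print("variance of error:", variance)
```

The mean and variance printed at the end correspond in kind to the "Mean" and "Variance" columns of actual forecast errors, though the numbers themselves depend entirely on the hypothetical model.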

We also calculated the variance of forecast for the two main equa¬ 
tions of both models in first difference transformation using the know¬ 
ledge that the true means of the series were zero. Table 5 shows that 
these results do not change the general impression and the single equa¬ 
tion least squares estimates still provide the smallest variances of the 
errors of forecast. 


TABLE 5

ERRORS OF FORECAST ASSUMING TRUE MEAN OF ZERO
(First Difference Transformation)

                 Actual errors of forecast
Equation     Mean     Standard error   Variance
                      of mean

Model I
I            -0.10    0.58             6.66
IA           0.36     0.66             8.64

Model II
IV           0.41     0.54             5.82
IVA          -0.69    0.90             16.15


6. CONCLUSIONS 

Care must be taken when drawing general conclusions from sampling
experiments of the type considered in this paper but the results
appear to us to be of a striking and significant nature. They indicate
that unless it is possible to specify with some degree of accuracy the
intercorrelation between the error terms of a set of relations and unless
it is possible to choose approximately the correct autoregressive
transformation so as to randomize the error terms, then a certain amount of
scepticism is justified concerning the possibility of estimating structural
parameters from aggregative time series of only twenty observations
when generated by systems analogous to those examined in this
paper. This scepticism will be considerably increased if it is also
attempted to make a choice of variables and time lags from the same data.

If the error terms are independent or nearly independent between 
equations, the results obtained justify the use of single equation least 
squares estimation, in which case the main problem lies in making the 
correct autoregressive transformation. This would seem to be the 
situation as regards many demand relationships, particularly those 
which are in terms of current variables and those relating to agricul¬ 
tural products. 

For short run prediction we would appear to be in a somewhat more
favourable position since the estimated variances of errors of individual
forecasts obtained from single equation regression analysis seem to be
in line with the actual errors and as small as could be expected even
in the absence of a simultaneous equations complication. However,
it must be remembered that such predictions assume that the same
system will continue.



CONTROL OF A GENERAL CENSUS BY MEANS 
OF AN AREA SAMPLING METHOD 


Gabriel Chevry

Institut National de la Statistique et des Études Économiques, Paris

The area sampling method was applied to industrial and 
commercial enterprises in urban areas of France to determine 
the adequacy of the general census in enumerating them. 
The investigation showed an under-reporting by the census 
of enterprises of all sizes. 

A GENERAL CENSUS of the population was taken in France on March 
10, 1946. Like the previous ones, this census involved in effect 
several simultaneous inquiries about people, households, dwellings, 
industrial and commercial enterprises, and agricultural establishments; 
different questionnaires were used for each of these categories of 
statistical units. 

Questionnaire No. 5 applied to the industrial and commercial enter¬ 
prises answering the following definition: an enterprise consists of a 
group of two or more persons working together permanently, in a 
permanent place, under the management of one or more representa¬ 
tives of the same trade name. According to this definition, a person 
working alone did not constitute an enterprise. Two partners, or a 
husband and wife working together, without assistants, constituted 
an enterprise employing no salaried staff. Each of the various subsidi¬ 
aries of the same firm constituted a separate enterprise, even if they 
were located in the same community. 

A first reckoning of the questionnaires No. 5 which had been filled in
through the census procedure revealed some anomalies, which indicated
that more than a negligible number of enterprises had not returned
their answers to the census questionnaires. In order to have a more
precise impression and to be able to appraise quantitatively the
accuracy of the census of industrial and commercial enterprises, the
management of the Direction des Enquêtes économiques of the
Institut national de la Statistique decided in October 1946 to resort to
its regional offices for an a posteriori control of the census process by
means of an area sampling method.

PRINCIPLES OF THE METHOD 

To reduce the cost of the process, the control was limited to the
urban areas in which the Institut national de la Statistique has regional
offices. It could be presumed that the census of March 10, 1946, had
been more accurately taken in the rural areas than in the large cities.

The control consisted simply in selecting at random in each of these 
cities, a certain number of areas in which the sampling was to take 
place, in listing the enterprises located in the selected areas, and in send¬ 
ing investigators to visit all these areas to make an inventory of all the 
enterprises located therein and to obtain completed questionnaires 
No. 5 from those enterprises which had not submitted their answers to 
the census. 

For each city the number of areas selected was considered adequate 
when, according to the census data, the number of enterprises located 
in these areas was at least equal to 2.5 per cent of the total number of 
enterprises located in the whole urban center. 

TECHNIQUE OF SAMPLING

The first step consisted in setting the limits of the urban centers
where the sampling was to take place. It was required that the sampling
not be limited to the city in which the regional office of the Institut
national was located, but should extend to the whole urban center
in which the city was included, this center consisting of all the small
communities or bounding parts of these communities in which the
inhabited lots of these divisions were adjoining or joined together by
parks, gardens, orchards, yards, workshops or other similar enclosures,
even if these houses or enclosures were separated from each other by a
ditch, a river or a public garden.

A large scale map of the urban center thus delineated was covered
with a regular orthogonal grid forming squares of 250 meters per side
for the smaller centers and 500 meters per side for the very large ones.
The squares covering areas containing construction were numbered
and a certain number of squares were taken at random by means of
Tippett tables.
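
The selection of numbered squares, together with the 2.5 per cent adequacy rule stated earlier, can be sketched as follows. This is only an illustration under stated assumptions: the sequential stopping rule, the pseudo-random shuffle standing in for the Tippett tables, and the enterprise counts per square are all hypothetical and are not taken from the investigation.

```python
import random


def select_squares(enterprises_per_square, target_fraction=0.025, seed=0):
    """Draw numbered squares at random, without replacement, until the census
    enterprises they contain reach the target share of the urban total."""
    total = sum(enterprises_per_square.values())
    order = list(enterprises_per_square)
    random.Random(seed).shuffle(order)  # stands in for a table of random numbers
    selected, covered = [], 0
    for square in order:
        if covered >= target_fraction * total:
            break
        selected.append(square)
        covered += enterprises_per_square[square]
    return selected, covered, total


# Hypothetical census counts for 200 built-up squares of one urban center.
rng = random.Random(42)
counts = {number: rng.randint(0, 60) for number in range(1, 201)}

selected, covered, total = select_squares(counts)
print("squares drawn:", len(selected))
print("enterprises covered:", covered, "of", total)
print("share of total (per cent):", round(100 * covered / total, 1))
```

Because whole squares are drawn, the share covered will usually overshoot 2.5 per cent somewhat, which is in keeping with the "at least" wording of the rule.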

As it was impossible to locate on the terrain the limits of the selected
grids, instructions were given to substitute for each of these grids,
polygons of approximately equal areas but bounded by actual streets.
The chart below shows two examples of this substitution, which
was indispensable in order to: a) facilitate the listing of the enterprises
located in each of the grids which had filled the questionnaire; and b)
point out unambiguously the buildings which were to be visited by the
investigators.

All the buildings located in the polygons thus selected and delineated
were visited by the investigators. Each of the investigators was
provided with a list of the enterprises which, according to the census, were
located in the group of buildings assigned to him. He checked on this
list the enterprises which had completed answers to Questionnaire
No. 5 for the census authorities, and those which had not been polled
by the census.

[Chart] SUBSTITUTION OF POLYGONS FOR SQUARES. AREAS INCLUDED
IN POLYGONS ARE SHOWN AS SHADED REGIONS

As this procedure took place in November 1946, several months after
the general census, special instructions were given to the investigators
so that they could distinguish, among all of the enterprises existing in
November 1946 and not having replied to the census questionnaire,
those which had been in existence in March 1946 and thus should have
been included in the census list, and those which had been created or
reopened between March and November. Conversely, it was hardly
possible to discover in November the number of enterprises which
had been closed since March. This fact seems of no consequence,
however, as bankruptcies were very scarce in this period after the war.

RESULTS OF THE SAMPLING

Table 1 shows the gross results of the sampling for each of the
selected cities and for the sample as a whole.

In the 18 urban centers where the sampling control was used,
201,359 enterprises had been included in the census of March 1946.
The selected sample included 6,154 enterprises, i.e. 3.1 per cent of the
total.

TABLE 1

GROSS RESULTS OF THE SAMPLE

                            Number of enterprises existing in the sample
                            Included in the       Not included in the
                Total in    census of March 1946  census of March 1946
Urban centers   census      Number    Per cent    Number    Per cent
(1)             (2)         (3)       (4)         (5)       (6)

Bordeaux        10,656      365       3.4         97        21.0
Clermont        3,186       127       4.0         52        29.1
Dijon           2,366       60        2.6         29        32.6
Lille           7,649       228       3.0         188       45.2
Limoges         2,870       105       3.6         34        24.5
Lyon            14,280      405       2.8         369       47.6
Marseille       14,365      531       3.7         182       25.5
Montpellier     3,281       161       4.9         61        27.5
Nancy           3,272       88        2.7         27        23.5
Nantes          5,005       195       3.9         19        8.9
Orleans         2,748       109       3.9         17        13.5
Paris           110,000     3,115     2.8         529       14.5
Poitiers        905         53        5.9         38        41.8
Reims           3,233       120       3.7         9         7.0
Rennes          2,835       89        3.1         23        20.5
Rouen           5,351       160       3.0         87        35.2
Strasbourg      6,958       128       2.2         60        31.9
Toulouse        4,680       115       2.5         82        41.6

Total           201,359     6,154     3.1         1,903     23.6

(2) Total number of enterprises included in the census of March 1946 in these centers.
(4) As a percentage of the total number of enterprises in the census.
(6) As a percentage of the number of enterprises in the sample: (6) = (5) / [(3) + (5)].

The investigation made on the spot has shown that 8,057 enterprises
existed in this sample. This means that 1,903 enterprises, or an
amount equal to 30.9 per cent of the enterprises included in the census
and 23.6 per cent of the existing enterprises, had been overlooked in
the census of March 1946.
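
The two percentages follow directly from the counts given above; a short check (the variable names are ours):

```python
in_census = 6154      # sample enterprises that had been included in the census
found_on_spot = 8057  # enterprises found to exist by the investigators

overlooked = found_on_spot - in_census
share_of_census = 100 * overlooked / in_census        # relative to census count
share_of_existing = 100 * overlooked / found_on_spot  # relative to all existing

print(overlooked, round(share_of_census, 1), round(share_of_existing, 1))
# prints: 1903 30.9 23.6
```

The second figure, 23.6 per cent, is the one carried forward as the estimated proportion of enterprises overlooked.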

Sampling theory permits one to determine the accuracy of this gross
result. It does not seem possible to use unreservedly the formula which,
in the case of a cluster sample, gives the variance of an estimate of
probability, p (p, in this case, would be the probability of an enterprise
being overlooked in the census). This formula supposes in effect
that all the clusters are of the same size, i.e. include the same number
of enterprises. It is obvious that, with the method applied in the
sample, this condition has not been fulfilled.

To measure the accuracy of the results under these circumstances,
it seems more advisable to apply the approximate formula provided
by Goldberg.¹ This formula deals with the errors in estimating a
ratio $x/y$, when the ratio is obtained as the quotient of estimates of
its numerator and denominator, both computed from at least $n$
elements. This estimate implies a bias whose average relative value is

$$\frac{C_y^2 - \rho C_x C_y}{n}$$

and the relative standard error of sampling is

$$\sqrt{\frac{C_x^2 + C_y^2 - 2\rho C_x C_y}{n}}$$

where

$C_x$ is the coefficient of variation of $x$, namely $\sigma_x/\bar{x}$,
$\bar{x}$ is the arithmetical mean of the values of $x$ in the universe,
$C_y$ is the coefficient of variation of $y$, namely $\sigma_y/\bar{y}$, and
$\rho$ is the coefficient of correlation between $x$ and $y$.

In the particular case which we are considering, n is the number of
polygons, bounded by streets, which were taken at random; the values
of y are the number of the enterprises existing in each of these
polygons, and the values of x are the number of enterprises overlooked by
the census in the same polygons.

The coefficient of correlation is rather high: 0.91. The average
relative value of the bias is 2/1000, i.e. practically negligible. The relative
value of the standard error of the sample is 7.6 per cent.²

It can therefore be estimated that in 95 cases out of 100 the
percentage of enterprises overlooked in the census lies within the limits of
23.6(1 - 2 × 0.076), or 20 per cent, and 23.6(1 + 2 × 0.076), or 27.2 per
cent. Thus we can state that, in the census of March 10, 1946, 20 to 27
per cent of the industrial and commercial enterprises were overlooked.
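
The limits just quoted, and the approximate formulas for the ratio estimate, can be sketched as below. The interval uses only figures from the text (23.6 per cent and a relative standard error of 0.076). The two formula functions assume the usual form in which the relative variance is divided by n; the coefficients of variation and the value of n passed to them are hypothetical, since the text does not report them.

```python
from math import sqrt


def ratio_relative_bias(c_x, c_y, rho, n):
    """Approximate relative bias of the ratio estimate x/y."""
    return (c_y ** 2 - rho * c_x * c_y) / n


def ratio_relative_standard_error(c_x, c_y, rho, n):
    """Approximate relative standard error of the ratio estimate x/y."""
    return sqrt((c_x ** 2 + c_y ** 2 - 2 * rho * c_x * c_y) / n)


def two_sigma_limits(estimate, relative_se):
    """Limits at two standard errors either side of the estimate."""
    return estimate * (1 - 2 * relative_se), estimate * (1 + 2 * relative_se)


# Figures from the text: 23.6 per cent overlooked, relative standard error 7.6 per cent.
lower, upper = two_sigma_limits(23.6, 0.076)
print(round(lower, 1), round(upper, 1))

# Hypothetical inputs, only to show how the formulas are evaluated.
print(ratio_relative_bias(c_x=1.0, c_y=0.95, rho=0.91, n=100))
print(ratio_relative_standard_error(c_x=1.0, c_y=0.95, rho=0.91, n=100))
```

The first line of output reproduces the limits of about 20 and 27.2 per cent given in the text.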

It has already been pointed out that a priori it does not seem sound
to apply this result to the entire French territory; the sampling has
been limited to the large cities and there is no reason to believe that
the census of enterprises has been as incomplete in the rural areas as
in the cities. Furthermore the high correlation between x and y might
indicate that the converse is true; namely the denser the population of
enterprises, the greater the number of omissions.

¹ Cf. "Méthodes statistiques modernes des Administrations fédérales aux États-Unis" by P.
Thionet (Hermann & Cie, Paris, 1946), p. 58, and "Sampling theory when the sampling units are of
unequal sizes" by W. G. Cochran (Journal of the American Statistical Association, June 1942).

² Actually, these calculations could be made for only 14 out of the 18 cities in which the sampling
took place. In the other 4 cities, we could not use the results relative to each of the selected polygons.
This in no way nullifies the conclusions which can be drawn from the results, however, as the degree of
error which was based on the data for 14 towns was greater than that found for the 18 towns.

COMPLEMENTARY RESULTS 

Although it may be interesting to know that in the large cities 20 
to 27 per cent of the enterprises were overlooked in the general census, 
this information is nevertheless insufficient. One must ask oneself how 
and why the census has been incomplete. In particular, the first ques¬ 
tion that presents itself is the following: Have the enterprises been 
overlooked in the census because of their nature or their size? One 
might in fact think that principally the small enterprises have been 
omitted. 

An answer to this question, at least with regard to the size of the
enterprises, is furnished by Table 2, which compares the distribution,
according to the number of their salaried employees, of the 756,235
enterprises included in the census of March 1946 covering France's
whole territory, with an analogous distribution of the 1,623 enterprises
which the sample revealed were omitted from the census.

TABLE 2

DISTRIBUTION, ACCORDING TO THE NUMBER OF THEIR SALARIED
EMPLOYEES, OF THE 756,235 ENTERPRISES INCLUDED IN THE
CENSUS AND THE 1,623 ENTERPRISES NOT INCLUDED IN
THE CENSUS BUT REVEALED BY THE SAMPLE

                                             Enterprises not included in
                 Enterprises included in     the census but appearing
Number of        the census                  in the sample
salaried                      Per cent                    Per cent
employees        Number       of total       Number       of total

0                230,354      30             397          25
1                197,252      26             546          33
2 to 5           214,504      29             443          27
6 to 10          48,561       6              100          6
11 to 20         29,188       4              65           4
21 to 50         21,294       3              47           3
51 to 100        8,001        1              15           1
More than 100    7,081        1              10           1

Total            756,235      100            1,623        100

The two series of percentages show a very satisfactory agreement for
all the enterprises having 2 or more salaried employees. For the two
other categories (0 and 1 salaried employee) the agreement is less
satisfactory. It does not seem necessary, however, to attach much
importance to this fact. First of all, the percentages resulting from the
sample can be considered as only approximate, in view of the standard
deviation of the sample. Moreover, the distribution of enterprises
between these two categories is subject to some uncertainty. The
enterprises comprising only two persons could be classified either as an
enterprise consisting of two employers and consequently no salaried
employee, or as one consisting of one employer and one salaried
employee, according to the manner in which the informant interpreted
the questionnaire. It is, in addition, quite possible that, particularly
for tax purposes, an enterprise having 2 employers and no salaried
employee had claimed having one employer and one employee, e.g.
that the employer's wife had been erroneously recorded as a salaried
assistant.

Under these circumstances, it seems wiser to put together the first
two categories of Table 2. In this way, one obtains percentages of
56 per cent for the census and 58 per cent for the sample, which may be
considered as sufficiently alike.
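
The effect of pooling the first two categories can be checked from the counts in Table 2. The sketch below recomputes the combined shares from the enterprise counts rather than from the rounded percentages; the variable names are ours.

```python
# Enterprises with 0 and with 1 salaried employee, from Table 2.
census_counts = {"0": 230354, "1": 197252}
census_total = 756235

sample_counts = {"0": 397, "1": 546}
sample_total = 1623

census_share = 100 * sum(census_counts.values()) / census_total
sample_share = 100 * sum(sample_counts.values()) / sample_total

print(round(census_share, 1), round(sample_share, 1))
```

Computed this way the two shares are roughly 56.5 and 58.1 per cent, in line with the sums 30 + 26 and 25 + 33 of the rounded table percentages.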

Thus it seems possible to conclude that, if the census of 1946 has 
been inadequate, it has been so for the enterprises of all sizes. According 
to the information gathered by the investigators of the sample, the 
fact that such a high percentage of enterprises was overlooked in the 
census was attributable to negligence on the part of the census enumer¬ 
ators. 

For these reasons the Institut national de la Statistique has not
published the results of the 1946 Census of industrial and commercial
enterprises. Since then, other methods have been used to examine
this subject. A permanent inventory of these enterprises has been
established. It will give a more complete and exact picture of the
industrial and commercial structure of France.



A PROCEDURE FOR OBJECTIVE RESPONDENT 
SELECTION WITHIN THE HOUSEHOLD* 

Leslie Kish 

Survey Research Center, University of Michigan 

In modern survey methods growing emphasis is placed on 
the objective selection of the sample. For surveys of the gen¬ 
eral population, increasing use is made of area sampling to ob¬ 
tain probability samples of households. Heretofore, scant 
attention has been given to the question of how to make an 
objective selection among the members of the household. 

A procedure for selecting objectively one member of the 
household is given as used in four surveys of the adult popula¬ 
tion. Demographic data as found in the sample are compared 
with outside sources for available factors. 


The Problem 


To OBTAIN random samples for surveys,¹ two basic conditions must
be satisfied:

1. The sampling method must provide for a known probability of
inclusion, other than zero, for every element of the population.

2. The method must be translated into a procedure that can and
will be applied in practice.


Area sampling is gaining acceptance as a practical and reliable
procedure for obtaining samples of households with known probabilities
of selection.² In general practice these probabilities are equal either
within specified strata or throughout the sample.

Given a sampling procedure—such as area sampling—which uses 
dwellings as units of sampling: when does the question of selection 
within households arise? There is no need of selection: 


* Presented at the 107th Annual Meeting of the American Statistical Association, New York City,
December 30, 1947.

¹ For discussion of sources of biases that may arise when personal judgment enters into the
selection see: Hauser, Philip M. and Hansen, Morris H., "On Sampling in Market Surveys," Journal of
Marketing, July 1944, pp. 26-31.

² King, A. J. and Jessen, R. J., "The Master Sample of Agriculture," Journal of the American
Statistical Association, Vol. 40, No. 229, March, 1945, pp. 38-42.

Hansen, M. H. and Hauser, P. M., "Area Sampling—Some Principles of Sample Design," Public
Opinion Quarterly, Summer, 1945, pp. 183-193.

Houseman, Earl E., "The Sample Design for a National Farm Survey by the Bureau of
Agricultural Economics," Journal of Farm Economics, Vol. XXIV, No. 1, February, 1947, pp. 241-245.

"A Chapter in Population Sampling" by the Sampling Staff of the Census Bureau, 141 pages.
Superintendent of Documents.



A. If the respondent is uniquely determined (as head of household 
or homemaker); or 

B. If the household is the unit of analysis and any adult member can 
give equally valid information.³

However, if the household contains more than one member of the 
desired population, it may be regarded as a cluster of population 
elements. 

One may decide to include in the sample every member of the population
within the household.⁴ However, this may be a statistically inefficient
procedure in general (depending on the cost-variance relationship
of the survey design), unless one of these three conditions holds:

A. Information about all members can be obtained from one of 
them. 

B. There is seldom more than one member of the population in the
household. For example: there is an average of 1.2 spending units
per household.⁵

C. The intra-class correlation within the household of the variables
measured is of negligible size, or is negative.

These conditions are not met generally in surveys of the attitudes of
the adult population. Furthermore, multiple interviews in one household
may lead to undesirable interview situations. Hence, there is
need for a procedure of selection that will translate a sample of the
households into a sample of the adult population. There are no great
theoretical difficulties, but a practical procedure must meet the demands
of efficiency and of applicability. There are several alternatives,
and the choice among them depends on a number of factors: the nature
and distribution of the population, the objectives and design of the
survey, and the available facilities. Under the last we may distinguish
the factors of cost, and of the nature and training of the field force.

The Conditions

It may be useful to describe very briefly the general sampling
procedures used by the Survey Research Center.⁶ A stratified random


³ This assumption may be unjustified. See: Deming, W. Edwards, "On Errors in Surveys," American
Sociological Review, August, 1944, p. 361.

⁴ See, for example: Watson, Alfred N., "Respondent Pre-Selection within Sample Areas," No. 2
of Technical Series on Statistical Methods in Market Research, page 7, Research Department, Curtis
Publishing Company, Philadelphia.

⁵ Based on the Consumer Finances Surveys of the Federal Reserve Board, conducted by the
Survey Research Center.

⁶ For a fuller description see: Goodman, Roe, "Sampling for the 1947 Survey of Consumer Finances,"
Journal of the American Statistical Association, September, 1947, pp. 439-448.




382 AMERICAN STATISTICAL ASSOCIATION JOURNAL, SEPTEMBER 1949

selection of dwellings is obtained by means of area sampling. Small 
areas, segments, define the clusters of households within counties (or 
groups of adjacent counties) selected as primary sampling areas. In 
each of these areas there are trained interviewers who are employed 
on a part-time basis. Detailed sampling instructions are given to them 
in order to insure correct understanding and execution of the different 
procedures required by a variety of types of surveys. 

The interviews are of the fixed question, free-answer kind, requiring 
generally from 30 minutes to one and one-half hours. Conducting the 
interviews in the home is believed to contribute toward a satisfactory
interview situation. The interviewer calls at pre-designated dwellings; 
the respondent is then selected by a fixed procedure. Return calls are 
made to find the not-at-homes. 

For each of a series of four surveys, conducted in June, August and 
December, 1946 and April, 1947 respectively, it was desired to obtain 
samples of about 600 interviews of the adult population of the Continental
United States. The samples were distributed in 31 primary
sampling areas: in four of the 12 largest metropolitan areas plus 27 
scattered counties. 

The population of the four surveys was limited to adults living in 
private households, excluding, because of interviewing difficulties and
cost considerations, some segments of the population: armed forces; 
hospitals; religious, educational and penal institutions; trailer, logging 
and labor camps; and hotels and large rooming houses. 

A procedure was devised for selecting one adult in each sample 
household. This procedure was favored over some alternatives (such as 
selecting every other adult found in sample households) for two 
reasons: 

1. It was desired to take no more than one interview in any household,
in order to obtain each interview before the respondent had
a previous opportunity to discuss the questions. Furthermore,
multiple interviews were thought to be statistically inefficient
because of the expected correlation of attitudes within the household.

2. An interview in every sample household was desired in order to 
avoid making futile calls on dwellings without interviews. 

With this procedure unbiased estimates may be obtained by giving
each respondent as weight the number of adults in the household. Such
differential weighting may in general increase the sampling error.
However, in the present instance this increase is not great because of
the concentration in two-adult households. About 60 per cent of the
households have two adults, about 10 per cent have more than three
adults and about 1 per cent have more than five. Another result of this
high concentration is that, unless the variable has a high correlation
with the number of adults in the households, the difference between
the weighted estimate and that in which each respondent has equal
weight will be small; in case of small samples this difference may be
negligible compared to the sampling error. For all attitudes thus tested
in these studies the difference was inconsiderable compared with
sampling error. However, in the comparison given in Table I there appear
two differences of about two percentage points.
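The weighting rule just described can be illustrated with a minimal sketch. The record layout, the function names and the four-household toy sample are assumptions made for illustration only; they are not taken from the surveys.

```python
from dataclasses import dataclass


@dataclass
class Respondent:
    adults_in_household: int  # number of adults listed for the household
    has_attribute: bool       # hypothetical yes/no item being estimated


def weighted_share(sample):
    """Share with the attribute, each respondent weighted by household size."""
    total_weight = sum(r.adults_in_household for r in sample)
    hits = sum(r.adults_in_household for r in sample if r.has_attribute)
    return hits / total_weight


def unweighted_share(sample):
    """Share with the attribute when every respondent has unit weight."""
    return sum(r.has_attribute for r in sample) / len(sample)


# Hypothetical toy sample, not survey data.
sample = [
    Respondent(1, True),
    Respondent(2, False),
    Respondent(2, True),
    Respondent(3, False),
]

print(weighted_share(sample), unweighted_share(sample))
```

The two figures differ in the toy sample only because the attribute was made to vary with household size; where the variable is unrelated to the number of adults, the two estimates tend to agree.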

In order that the procedure may be applied and checked without
great difficulty it is desirable to have a variable for ordering the members
of the household; a variable that can be obtained by the interviewer
objectively and easily. The age and sex of the members of the
household were used for this purpose.

The Procedure 

A "face sheet" is assigned by the sampling section to each sample
dwelling unit; on each of these there are, in addition to the address of
the dwelling, a form for listing the adult occupants, and a table of
selection. At the time of the first contact with the household, the interviewer
lists each adult separately on one of the six lines of the form; each
is identified by entering in the first column his relationship to the head
of the household (wife, son, brother, roomer, etc.). In the next two columns
the interviewer records the sex and (if needed) the age of each
adult. Following this the interviewer assigns a serial number to each
adult: first the males are numbered in order of decreasing age, followed
by the females in the same order. To assign these serial numbers it
is necessary to obtain the ages of all adults only in that small portion
of households in which there are two adults of the same sex and not
connected by parent-child relationship.

Then the interviewer consults the table of selection; this table tells
him the number of the adult to be interviewed. One of the six tables 
(A to F) is printed on each face sheet; each of the six tables is assigned 
to one-sixth of the sample addresses in a systematic manner. 

Tables to select the person to interview: 

Each of these tables was assigned to one-sixth of the sample
addresses in the four surveys.

A   Number of adults in D.U.     1   2   3   4   5   6 or more
    Interview adult numbered:    1   1   3   2   5   1

B   Number of adults in D.U.     1   2   3   4   5   6 or more
    Interview adult numbered:    1   2   1   3   4   2

C   Number of adults in D.U.     1   2   3   4   5   6 or more
    Interview adult numbered:    1   1   2   4   1   3

D   Number of adults in D.U.     1   2   3   4   5   6 or more
    Interview adult numbered:    1   2   3   1   2   4

E   Number of adults in D.U.     1   2   3   4   5   6 or more
    Interview adult numbered:    1   1   2   1   3   5

F   Number of adults in D.U.     1   2   3   4   5   6 or more
    Interview adult numbered:    1   2   1   4   3   6
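The listing, numbering and look-up steps may be sketched as follows. The entries are those of tables A to F as read from this section; the tuple layout for an adult, the function names and the example household are assumptions introduced only for illustration.

```python
# Tables A-F as read from this section: entry i-1 is the serial number to
# interview when the dwelling unit has i adults (last entry: 6 or more).
SELECTION_TABLES = {
    "A": (1, 1, 3, 2, 5, 1),
    "B": (1, 2, 1, 3, 4, 2),
    "C": (1, 1, 2, 4, 1, 3),
    "D": (1, 2, 3, 1, 2, 4),
    "E": (1, 1, 2, 1, 3, 5),
    "F": (1, 2, 1, 4, 3, 6),
}


def assign_serial_numbers(adults):
    """Order adults: males by decreasing age, then females by decreasing age.

    `adults` is a list of (relationship, sex, age) tuples, sex "M" or "F".
    The serial number of an adult is its position in the result, plus one.
    """
    males = sorted((a for a in adults if a[1] == "M"), key=lambda a: -a[2])
    females = sorted((a for a in adults if a[1] == "F"), key=lambda a: -a[2])
    return males + females


def select_respondent(adults, table):
    """Return the adult designated by the face-sheet table for this household."""
    ordered = assign_serial_numbers(adults)
    column = min(len(ordered), 6) - 1  # households of 6 or more share a column
    serial = SELECTION_TABLES[table][column]
    return ordered[serial - 1]


# Hypothetical household: head, wife and an adult son.
household = [("wife", "F", 40), ("head", "M", 45), ("son", "M", 22)]
print(select_respondent(household, "A"))  # table A, three adults: number 3
```

In this example the head and son receive numbers 1 and 2 and the wife number 3, so a face sheet carrying table A designates the wife.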


The slightly changed order of selection in the tables given below has 
certain advantages over that of the above tables: 

The low numbers of selection are concentrated in tables A, B and C;
therefore the procedure will yield a male respondent in a great majority
of the addresses to which these tables have been assigned. Evening
calls are necessary to find at home most of the male respondents; the
interviewer may concentrate his evening calls at these addresses. Conversely
the interviewer may use his time during the day by calling at
addresses to which tables D, E and F are assigned, with an increased
chance of success.

The proper fractional representation of each adult is approximated
more closely without the necessity of printing many more forms; the
chances of selection are exact for all adults in households with 1, 2, 3, 4,
and 6 adults. Because numbers above six are disallowed, there are
one or two adults in a thousand (generally young females) who are
not represented; there is a "compensation" for them in the over-representation
of number five in the households with five adults.


Relative                  If no. of adults in household is
frequency    Table        2     3     4     5     6 or more
of use       number       Select adult numbered

  1/6          A          1     1     1     1     1
  1/12         B1         1     1     1     2     2
  1/12         B2         1     1     2     2     2
  1/6          C          1     2     2     3     3
  1/6          D          2     2     3     4     4
  1/12         E1         2     3     3     3     5
  1/12         E2         2     3     4     5     5
  1/6          F          2     3     4     5     6
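The chances of selection implied by the eight revised tables can be checked by exact arithmetic. The sketch below takes the table entries and relative frequencies as read from this section; the function name and the data layout are assumptions. A household with one adult needs no table and is left out.

```python
from fractions import Fraction

# (relative frequency of use, serial numbers for 2, 3, 4, 5, 6-or-more adults)
REVISED_TABLES = {
    "A":  (Fraction(1, 6),  (1, 1, 1, 1, 1)),
    "B1": (Fraction(1, 12), (1, 1, 1, 2, 2)),
    "B2": (Fraction(1, 12), (1, 1, 2, 2, 2)),
    "C":  (Fraction(1, 6),  (1, 2, 2, 3, 3)),
    "D":  (Fraction(1, 6),  (2, 2, 3, 4, 4)),
    "E1": (Fraction(1, 12), (2, 3, 3, 3, 5)),
    "E2": (Fraction(1, 12), (2, 3, 4, 5, 5)),
    "F":  (Fraction(1, 6),  (2, 3, 4, 5, 6)),
}


def selection_chances(n_adults):
    """Chance of selection for each serial number, for n_adults from 2 to 6."""
    column = n_adults - 2
    chances = {serial: Fraction(0) for serial in range(1, n_adults + 1)}
    for frequency, row in REVISED_TABLES.values():
        chances[row[column]] += frequency
    return chances


for n in range(2, 7):
    print(n, selection_chances(n))
```

With the entries as read here, every adult has chance 1/n in households of 2, 3, 4 and 6 adults. In households of five adults the chances are not all equal to 1/5: numbers 3 and 5 each come out at 1/4 and the others at 1/6.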





















It may be noted that the procedure can be modified easily to select 
a constant proportion, say half or one-third, of the adults. In that 
case, of course, each adult would have the same chance of selection, 
regardless of the size of the household. In some of the households more 
than one interview would be taken, in some no interviews at all. The 
above tables may be modified readily to apply the changed procedure. 

Results: Checks against outside sources 

The distributions of the respondents were checked with outside 
sources whenever we could obtain or adapt reliable data for valid comparison.
They are presented in Tables I and II.

TABLE I

COMPARISON OF RESPONDENTS IN SAMPLE WITH CHECK DATA

                                  Total for 4 Surveys
                                        N=2372            Data from the Surveys of:
                        Check                            June    Aug.    Dec.    April
                        Data    Weighted  Unweighted     1946    1946    1946    1947
                                                         N=685   N=502   N=570   N=626

White                   90.6      89.1       89.4         89      90      88      89
Native born             90.0      90.6       90.8         90      91      91      90

Age in years
  21-29                 22.8      22.7       20.6         22      24      22      22
  30-44                 33.9      33.0       34.1         34      33      36      30
  45-59                 25.6      27.3       26.7         26      27      25      32
  60 and over           17.7      16.1       17.8         17      16      17      16
  NA                               0.9        0.8          1       1       0       1

Education
  Not finished grammar  46.1      26.7       26.2         24      26      24      30
  Finished grammar                19.1       19.3         17      17      22      20
  Some high             17.4      17.1       17.4         18      19      18      14
  Finished high         22.9      20.4       20.0         20      21      18      22
  Some college           7.1       9.9        9.4         11      11      10       7
  Finished college       5.0       7.0        6.9          9       6       7       6
  NA                     1.5       0.8        0.8          1       1       1       1


The check data are those deemed most nearly valid for comparison: 
the per cent white and the educational attainment are for April 1947
from Census Series P-20 No. 16, age is for July 1946 from Series P-S
No. 19, and the per cent native born is from the 1940 census figures for ages
16-64. In the data obtained from the survey results each respondent is 
weighted by the number of adults (from one to six) in his household. 
However, the column “unweighted”, in which each respondent has 
unit weight, is included for purposes of comparison. 

The appropriate sampling errors (i.e. two standard errors) of the
estimates in Table I range from three to six percentage points for
various items on each of the surveys, and from two to four percentage
points for the four surveys combined. The estimates of color, nativity
and age groups are in general agreement, with only one of the 30 estimates
lying slightly beyond the given ranges.⁷
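For orientation, two standard errors of a percentage can be computed under simple random sampling. That formula is an assumption brought in here for comparison; it is not the computation used for the ranges quoted above, which presumably reflect the area-sample design and the weighting.

```python
import math


def two_standard_errors(p, n):
    """Two standard errors of a proportion p under simple random sampling."""
    return 2 * math.sqrt(p * (1 - p) / n)


# About 600 interviews per survey; 2372 for the four surveys combined.
for n in (600, 2372):
    for p in (0.1, 0.5):
        points = 100 * two_standard_errors(p, n)
        print(f"n={n}, p={p}: about {points:.1f} percentage points")
```

For a single survey of about 600 interviews this gives roughly two and a half to four points, somewhat below the three to six points stated for the actual design.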

The data of the first three surveys showed an apparent upward bias 
in reports of college education. This was suspected to be a response 
bias to the single question on this topic. For the fourth survey three 
more questions were inserted to get more data on persons claiming 12 
years of school or more; this involved additional work on only a third 
of the respondents. The fourth survey shows close agreement with the 
check data on college education, pointing to elimination of a response 
bias.⁷


Check of the sex ratio 

TABLE II

PERCENTAGES OF MALES

Check Data:     All Adults     Respondents in Sample Weighted by Number in Household
Census          in Sample
Estimate        Households     4 Surveys    June    August    December    April
July, 1946                     Combined     1946    1946      1946        1947

  48.2            47.9           46.8        45      46        46          52


The Bureau of Census estimate is for the civilian non-institutional 
adult population for July, 1946 (series P-S No. 19). The second figure is 
an internal check, obtained by tabulating all adults listed as living in 
sample households. 

Males appeared to be underrepresented among the respondents of 
the first three surveys. Although the difference was small, its presence 
in three surveys pointed to possible occasional deviation from rigorous 
procedure in the field. It may be noted that a 3 per cent bias in the
sex ratio would require an unlikely 33 per cent sex difference in an attitude
studied in order to produce a 1 per cent difference in survey
results; a difference within the limits of accuracy of these surveys.
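The arithmetic behind this remark can be sketched as follows. The overall proportion is a mixture of the male and female proportions, so an error in the male share shifts the result by that error times the difference between the sexes. The male shares of 48 and 45 per cent and the attitude levels of 60 and 27 per cent are hypothetical values chosen only to give a 3-point bias and a 33-point sex difference.

```python
def overall_share(male_share, p_male, p_female):
    """Overall proportion holding an attitude, as a mixture of the two sexes."""
    return male_share * p_male + (1 - male_share) * p_female


# Hypothetical values: a 33-point difference between the sexes.
p_male, p_female = 0.60, 0.27

correct = overall_share(0.48, p_male, p_female)  # male share as it should be
biased = overall_share(0.45, p_male, p_female)   # males short by 3 points
shift = correct - biased

print(f"shift in survey result: {100 * shift:.2f} percentage points")
```

The shift equals 0.03 times 0.33, or about one percentage point.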

Tabulations of all adults in the sample households, as listed on the
face sheet, give close agreement with the check data. Continued investigation
of the problem points to two sources of this small bias, both
due to the fact that males are more difficult to find at home even with
repeated call-backs: overrepresentation of males among the non-responses,
and an occasional substitution on the part of a few interviewers.

⁷ These statements are also borne out by results on three more national surveys in which the same
procedure was used.

Two other small sources of error present in the first three surveys 
were eliminated by minor changes in the procedure. 

Conclusion 

The described procedure of selection within the household gave
results that were satisfactory within the demands of the survey objectives.
While there were occasional departures from correct procedure
in the field, the procedure is such that extensive control over its
field application, hence improvement, is feasible. It must be emphasized,
however, that a practical sampling procedure is not an automatic
device. For success it depends on a field force having both the
training and the morale necessary for correct application.



BENEFICIARY STATISTICS UNDER THE OLD-AGE 
AND SURVIVORS INSURANCE PROGRAM AND 
SOME POSSIBLE DEMOGRAPHIC STUDIES 
BASED ON THESE DATA* 

Robert J. Myers

Chief Actuary, Social Security Administration, Washington, D. C.

The old-age and survivors insurance program, under the
Social Security Act, is a nationwide social insurance system
covering most employment in industry and commerce. It is
possible that in the near future this coverage will be extended
to virtually all employment in the country since almost all
interested organizations are in favor of this. From the extensive
scope of the present limited coverage and also of the
anticipated, practically universal future one, there naturally
will emerge over the years a very large number of beneficiaries,
both retirement and survivor cases. This paper will
first present a limited amount of background regarding the
present program. There will then be set forth the sources and
types of data now being presented or available; finally there
will be described some demographic studies that have been
made, as well as others which are possible from these data.

Coverage of the Program 

The employment coverage of the present system includes substantially
all wage and salary workers in industry and commerce except
those engaged in railroad employment for whom a separate system
has been set up. The major employment categories excluded from
coverage are agricultural, domestic, governmental, non-profit, and
railroad, as well as self-employment. Table 1 presents certain relevant
data as to the extent of coverage under the system.

Benefit Provisions and Definitions

Monthly retirement benefits are available under the OASI program
for "fully insured"¹ workers age 65 and over, and for their wives 65 and
over and their dependent children under 18. These payments are not

* Presented at the 108th Annual Meeting of the American Statistical Association in Cleveland, Ohio,
December 27-29, 1948.

¹ Roughly, defined as having had covered employment of a significant amount ($50 or more) in at
least half of the calendar quarters since 1936, or attainment of age 21 if later, and before attainment
of age 65 or death. Employment before age 21 and after 65 is credited toward meeting the requirement
even though it does not serve to increase the requirement. In no case is there required more than 40
such quarters.




TABLE 1

COVERAGE OF THE OLD-AGE AND SURVIVORS INSURANCE PROGRAM

Description of Coverage                                           Number involved

1. Persons in covered employment during average week in 1947         34 million
2. Persons in covered employment some time or other during 1947      49 million
3. Persons, at end of 1947, who were at some time in covered
   employment during 11 years of the program                         77 million
   a. Number, in this category, who had earned sufficient
      wage credits to be insured                                     43 million
      Average cumulative wage credits for this group                 $11,000
   b. Number with insufficient wage credits to be insured            34 million
      Average cumulative wage credits for this group                 $1,300
4. Equivalent amount of insurance in force under survivor
   provisions of the OASI system at end of 1947                      $75 billion


made, however, if the individual earns $15 or more per month in
covered employment. As a result, many individuals do not file claim
upon attainment of age 65, since they continue at work. On the other
hand, some persons file a retirement claim but return to work, either
permanently or sporadically. Beneficiaries who actually receive monthly
payments are said to be in "current payment status." When there are
added those who have filed but are not currently receiving payment because
of covered employment, the total is termed "benefits in force."
At the end of 1948, there were 1.97 million persons age 65 and over who
were fully insured, but only 53 per cent were in current payment
status. Those who had filed a claim but were not in current payment
status represented 9 per cent, while the remaining 38 per cent never
had filed; thus 62 per cent of the benefits were in force.

Survivor benefits are available² to widows when they are age 65 or


² All survivor benefits are payable if the deceased worker was "fully insured"; in addition, even
though "fully" insured status was not present, monthly benefits are payable to orphans and widowed
mothers where "currently insured" status (roughly, covered employment of $50 or more in half of the
quarters of the last 3 years) existed.









over, or when they have 1 or more dependent children under 18 in their 
care. Monthly benefits are available for children under 18 who are 
orphaned. In certain cases, there are monthly benefits for aged parents 
of deceased workers, if there was no surviving widow or child under 
18. These survivor benefits are subject to the same concepts of current
payment status and benefits in force. 

Table 2 gives certain summary data on beneficiaries. It may be 


TABLE 2

BENEFICIARIES OF THE OLD-AGE AND SURVIVORS INSURANCE PROGRAM
IN CURRENT PAYMENT STATUS, DECEMBER 1948

                                 Number            Average
Category                         (in thousands)    Monthly Benefit*

A. Retirement
   Primary                          1048              $25
   Wife (supplementary)              321               13
   Child (supplementary)              25               12

B. Survivor
   Widow, old-age                    210               21
   Widow, with children              142               21
   Child                             556               13
   Parent                             12               14

Grand Total                         2314                †

* Rounded to the nearest dollar.
† Not computed because not significant.


noted that the 2.3 million beneficiaries are receiving payments at an 
annual rate of about $550 million. It has been recommended by the 
Social Security Administration and the Advisory Council of the Senate 
Finance Committee (Senate Document 208, 80th Congress, 2nd Session)
that the benefit level be raised and additional beneficiary categories be 
added. 
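The totals quoted from Table 2 can be checked roughly from its rows. The sketch below uses the counts and the rounded average benefits as printed; since the averages are rounded to the nearest dollar, the annual figure is only approximate. The variable names are assumptions, and the last line simply sets the count of primary beneficiaries beside the 1.97 million fully insured persons age 65 and over mentioned earlier.

```python
# (category, number in thousands, average monthly benefit in dollars), Table 2
TABLE_2 = [
    ("Primary", 1048, 25),
    ("Wife (supplementary)", 321, 13),
    ("Child (supplementary)", 25, 12),
    ("Widow, old-age", 210, 21),
    ("Widow, with children", 142, 21),
    ("Child (survivor)", 556, 13),
    ("Parent", 12, 14),
]

total_thousands = sum(number for _, number, _ in TABLE_2)
monthly_dollars = sum(number * 1000 * benefit for _, number, benefit in TABLE_2)
annual_millions = 12 * monthly_dollars / 1e6

# Primary beneficiaries relative to 1.97 million fully insured aged 65 and over.
primary_share = 1048 / 1970

print(total_thousands, round(annual_millions), round(100 * primary_share))
```

The rows add to the grand total of 2314 thousand and to an annual rate a little under $550 million; the primary beneficiaries amount to about 53 per cent of the 1.97 million.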

Another concept involved is the distinction between "initial" and
"subsequent" entitlements as represented in the new claim awards of a
particular year. An "initial" entitlement is the first claim on any particular
case, such as a worker retiring, a wife's benefit claim filed when
the husband retires, or a widow's benefit claim when she is over 65 or
has dependent children when her husband dies. A "subsequent" entitlement
is a claim which would not have resulted in monthly payments








at the time of the insured's death, as for instance a wife's benefit claim
when she attains 65 after her husband retired or a widow’s benefit claim 
when she attains 65 after her husband’s death. 

In contrast with temporary suspension of benefits, as for instance 
because of covered employment, there are a number of reasons for permanent
termination of claims, such as death, remarriage of widows,
and marriage or attainment of age 18 of child beneficiaries. 

Sources of Data 

Beneficiary data are published in a number of official publications of
the Social Security Administration. The Annual Report carries only
general summary data, such as number and amount of benefits in
current payment status and total payments certified, allocated by
state and also given for the entire country by category of beneficiary.

The Social Security Yearbook presents a large volume of statistics
on a calendar-year basis. The most basic beneficiary data are the numbers
and monthly amounts of benefits awarded, in current payment
status, and in force by beneficiary type for various periods (also carried
forward monthly in the Social Security Bulletin). There are given data
on benefits suspended and on benefits terminated, classified according
to reason. Data for the current year are shown for new awards of the
year, benefits in force, and benefits in current payment status by age,
sex, and race, as to both number and amount. Although the data are
summarized into fairly broad age groups (generally quinquennial)
because of desired economy in publication, the figures are actually
available by single years of age. This, in many instances, is necessary
for proper analysis, especially for retired persons where there may be a
great difference between those who retire promptly at age 65 and those
who defer retirement. Beneficiary data are also available to indicate
in summary form the family composition of the beneficiary groups,
both for new awards in the year and for benefits in current payment
status at the end of the year.

The aggregate data on benefits are an actual count, whereas the
other data giving details by age and family classification are based on
a 20 per cent sample.

In addition to published periodical data, there is prepared each year 
a volume of substantive claims statistics which shows in considerable 
detail the claims awards (on a 20 per cent sample basis) by age of retired 
or deceased wage earner and type of benefits available. 

Once each year there is obtained a count of the number of beneficiaries
actually receiving payments in each county in the United
States. These data are subdivided only by type of benefit and not by
age, sex, or other family characteristic and are used not only for publicity
purposes, but also for contrast with the recipient-load under the
various public assistance programs.

At irregular intervals, sample surveys on an interview basis are made
to study various phases of beneficiary living conditions and resources.
These surveys, although based on the interview approach and
subject to some limitations, are stratified samples of beneficiaries from
the claims files. The results of these studies are published in the Social
Security Bulletin, or in special releases. Also from time to time, there
are issued various Analytical Notes which sometimes relate to special
studies and tabulations dealing with different characteristics of beneficiaries.
In addition, some of the Actuarial Studies, published intermittently,
analyze beneficiary data and present future estimates thereof.

Adaptability of OASI Data for Demographic Studies

There are many demographic studies which may be made from the 
basic data collected under the OASI program. Some of these raw data 
now exist in tabulations and in analyses which are under way. In other 
cases, different types of coding and tabulating are necessary. For 
instance, in the past, deaths or other events such as remarriage have 
not been allocated by the date of occurrence, but rather classification 
was by date of administrative action, either original filing or final award. 
Complications, too, arise because of the several concepts described 
previously, namely, in force status versus current payment status and 
initial entitlement versus subsequent entitlement. These various complications
can probably be resolved or taken into account by appropriate
approximation or adjustment.

The various basic data which are tabulated and analyzed are necessary
in the actual operation of the program. This is particularly the
case in regard to actuarial cost estimates and other analyses for policy 
and planning, and also from the administrative standpoint in estimat¬ 
ing work loads for the near future. 

Mortality Studies 

Certain crude mortality studies have been made. For primary beneficiaries
(retired workers), it has been found that as contrasted with
general population experience, the mortality has been somewhat higher
for men (by roughly 5 to 15 per cent), but somewhat lower for women
(by about 15 to 20 per cent).

It appears likely that mortality for male primary beneficiaries is
relatively unfavorable because with present high employment conditions,
those who have actually retired, in most instances have done so
only because of disability. This argument is further strengthened by the
fact that those who retired promptly at age 65 and those who retired
in 1940, the first year in which benefits were available, tended to have
higher mortality than other retirants.

The mortality for female primary beneficiaries, despite the factors 
previously mentioned as relating to male mortality, may have been 
lower than normal because women who had been working near age 65 
long enough to be insured might, on the whole, be superior lives. For 
both sexes, the death rates appear to have been independent of the size 
of benefits, which in turn reflects economic status. 

Considering other types of beneficiaries, studies indicate that both
wife and widow beneficiaries, all of whom are age 65 or over, had relatively
low mortality as contrasted with general female mortality,
namely, by an amount of 10 to 15 per cent relatively. The mortality
among child beneficiaries, the vast majority of whom are paternal
orphans, is about 10 per cent higher than would be expected from general
population experience. At the same time, the mortality of the
widowed mother beneficiaries is quite close to the general population
experience.

A recent study made primarily to study remarriage experience has
indicated that for white widowed mothers, the mortality was perhaps
5 per cent below that "expected," but for nonwhite widowed mothers
there was lower mortality by about 15 to 20 per cent. The relatively
favorable nonwhite mortality may occur because those nonwhite
workers who are regularly in covered employment sufficiently to be
insured probably represent those of above average economic status,
and therefore the health conditions of their family would tend to be
above average.

Unlike studies based on census data, the particular individual beneficiaries
under OASI will be under observation over many years so that
there is considerably more accuracy as to age reporting, both as to the
living and as to the deaths. Eventually, there will be a large body of
data which will yield information on that great "unknown" of mortality
investigations, namely, the true experience at the older ages (say,
85 and over).

Also, from the OASI data it will be possible to examine mortality
by marital status. Certain Census studies in the past have shown that
married persons have somewhat lower than average mortality and that
widowed persons have relatively high mortality. Preliminary studies
made from OASI data indicate that although married beneficiaries,
both the retired workers and their spouses, have somewhat favorable
mortality, there is no indication of very unfavorable mortality among
widows, whether beyond age 65 or not.

By having the same group of lives continuously under observation, 
it will be possible to construct “generation mortality tables,” which 
will give valuable indications as to the mortality of a specific group of 
individuals traced through from age 65, or slightly older, to eventual 
death. This contrasts with Census methods where it is only possible to 
study mortality by age for different groups of persons in a given year 
or short period of years; for instance, the experience at ages 80-84 will
relate to different individuals than that at ages 65-69. The objection 
may here be raised as to amalgamating data which are not entirely 
homogeneous since they do not relate to a closed group of persons. 

Remarriage Studies 

The only available analyzed information on remarriage experience in 
the United States has been the American Remarriage Table which re¬ 
lates to a relatively small set of data arising from workmen’s compensa¬ 
tion claims under private insurance carriers.* Under the OASI system, 
there is a great body of experience since the number of new widows
created each year is close to 100,000. For a study of this type it is 
necessary to consider both age and duration of widowhood. 

One crude study that has been made in regard to the remarriage 
experience under the OASI system indicates that the actual remarriage 
rates are considerably higher than those of the American Remarriage
Table, apparently by from 50 to 75 per cent. Several explanations 
seem likely. Since 1940 marriage rates have been quite high so that 
perhaps remarriage rates too have been high. Also the level of benefits
under the OASI system is relatively low as compared to that under work¬ 
men’s compensation claims so that there is less financial deterrent to 
remarriage. 

At the lower end of the age scale, it is possible to obtain information 
as to mortality and marriage of children under 18 since these are reasons
for termination of child’s benefits.

Other Studies 

If permanent and total disability benefits are added to the OASI 
program, as has been proposed, there will eventually be much data on
disability incidence and termination rates. This has been a subject
on which there is very little material in the United States other than
the experience under life insurance policies and certain pension plans
for governmental employees. Some students feel that neither of these
sources of data nor the experience of foreign social insurance systems
is indicative of the possible benefit experience which might arise for
such a program in this country.

* Roeber, W. F. and Marshall, R. M., “An American Remarriage Table,” Proceedings of the Casualty
Actuarial Society, Vol. XIX.

The beneficiary data under the OASI system can also yield information
on family composition. Among the various data which could be
obtained are the following: relative ages of husbands and wives, marital 
proportions, proportion of persons with children and numbers of such 
children, etc. In all instances, the analysis could be carried forward 
by age and race. 

In the preceding discussion, the various demographic studies have 
been discussed on a nation-wide basis since this is the nature of the 
OASI program. However, the analyses could be subdivided by regions 
or even by states, but then there would be considerable administrative 
difficulty because the records are not set up to show existing residence
combined with other demographic statistics. Likewise, any studies of
migration of beneficiaries, although theoretically possible, would not
be administratively feasible. 

In addition, the analyses could in all instances be made by economic 
level since benefits payable are related to past covered wages; this 
would be especially valuable if coverage were extended to all employ¬ 
ment, but under present limited coverage the results would not be too 
significant.

Although it would be highly desirable to make mortality and other 
investigations according to occupation, a subject on which there is 
available all too little information, it would appear that this would 
not be too feasible from the OASI data. For one thing, the records 
which are obtained through the working lifetime do not indicate occu¬ 
pation, although they do show the industries in which the worker was 
employed. However, because of the dynamic character of the labor 
market, many individuals shift frequently from one type of industry 
to another so that even if the lifetime data were available for each 
worker at time of retirement, there would be considerable difficulty 
not only theoretically, but even more important administratively, in 
classifying the “normal” industry.

As is apparent from the wide scope of the program, there is a vast 
amount of statistical data in the files which has not been tabulated or 
collected. The manpower limitations during the war years, of course,
had an appreciable effect on studies that might desirably have been
made. Since the war, there have been both manpower and budgetary 
limitations. 

As contrasted with the various mortality and other types of investi¬ 
gation that insurance companies do, the Social Security Administration 
has found it possible to do only a limited amount of actuarial and statis¬ 
tical investigations. In part, this might be explained by the fact that 
insurance companies must keep a close surveillance of their experience 
in order to be able to charge proper premium rates and to allocate 
dividends equitably. 

On the other hand, under a governmental system, there are not such
concepts of individual equity. Rather the benefit schedules are set by
law and are payable accordingly, with only a general examination 
being required as to the immediate and long-range over-all financial 
balance. However, it is hoped that in the near future more extensive 
tabulation and collection of data will be possible so that some of the 
demographic analyses indicated previously may be made. 



BY-PRODUCT DATA AND FORECASTING IN 
UNEMPLOYMENT INSURANCE* 


Nathan Morrison

Division of Placement and Unemployment Insurance 
State of New York 

The unemployment insurance program has been operating 
in this country since the beginning of 1938. There have been 
many growing pains during these first eleven years. The 
operating problems involved in setting up this vast program 
have left little time thus far for careful consideration of the 
statistical data that are automatically produced in the course 
of its operations. 

These data, now and in the future, can be a fruitful source 
of information on changes in economic activities. The compre¬ 
hensive nature of these data, covering every city and county, 
giving information by detailed sub-industry, and yielding sta¬ 
tistics for each week and month, makes it possible to test 
various theories of the business cycle, and may, as a result of 
careful analysis, reveal interrelationships among economic 
variables that are now unsuspected. 

The data described in this paper are collected by all the 
state unemployment insurance agencies as part of their opera¬ 
tions. However, some states publish only the statistics which 
are needed for administrative and operating purposes. Un¬ 
doubtedly, more would be published if the existence of these 
valuable materials became generally known. 

BY-PRODUCT DATA 

What does the unemployment insurance system offer in the way of
by-product data? Essentially, it provides detailed geographical 
and industrial data on employment and payrolls in firms covered by the 
state unemployment insurance laws, and on unemployed persons claim¬ 
ing benefits. 

The coverage of the unemployment insurance program is not yet 
complete. About 2 out of every 3 workers in the country are now in¬ 
cluded. In manufacturing, however, about 98 per cent are covered. The 
chief groups excluded are government employees, agricultural workers, 
self-employed persons, workers in some small firms (the state laws vary 
in this respect), employees of non-profit organizations, and domestic
workers. (Railroad workers have a separate unemployment insurance
system.) Federal and state legislation to include practically all of these
groups may be confidently expected within the next few years.

* Presented at the 108th Annual Meeting of the American Statistical Association in Cleveland,
Ohio, December 27, 1948.

For many industries and geographic areas, the present laws represent 
substantially complete coverage. In terms of long-range studies, work 
can be planned on the safe assumption that full coverage will be reached 
in the near future. 

A concrete example is perhaps the best way to indicate the data 
available and some of the ways these data can be used. 

For every city and county in New York State, tabulations are available
showing employment and payrolls by detailed industry group. The
industry classification uses a 4-digit code that gives about 600 different 
sub-industry groups. The employment data show the number of work¬ 
ers during the mid-week of each month. The payrolls are on a quarterly 
basis. Since 1937, all employers with 4 or more employees (excluding 
the groups mentioned earlier) have been submitting such employment 
and payroll reports together with their tax contributions on a regular 
routine basis, as a normal activity of the unemployment insurance 
system. Actually, these reports contain more than the summary data 
on employment and payrolls. The name, social security number, and
earnings of each individual worker are listed and submitted also each 
quarter. The reports are needed in order to collect the tax and to pro¬ 
vide the individual earnings data used to determine benefit rights when 
an unemployed person files a claim for unemployment insurance. 

These reports are mandatory under the law with severe penalties for 
failure to comply. The statistical data on employment and payrolls are
collected as a by-product of necessary operations and are thus obtained 
at a minimum cost. 

Another operating feature is the maintenance of lists, card files, and 
addressograph plate files, of all covered firms. Thus if a sample of firms 
is needed, the universe from which it is to be selected is readily avail¬ 
able. 

The usefulness of these data on establishments, employment and 
payrolls, by industry and geographic area, for market analysis, indus¬ 
try studies, labor market studies, and other related purposes is obvious. 
For study of changes in economic activity, Arthur F. Burns* has sug¬ 
gested the preparation of tables showing not only the total change in 
employment each month by industry, but also the number of firms 
increasing their work force and the number of firms dropping workers.
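
A table of the kind Burns suggests can be sketched as follows. This is a minimal illustration, assuming each employer report can be reduced to a tuple of (industry, firm identifier, employment last month, employment this month); that layout, the function name, and the sample figures are hypothetical and are not the New York tabulation procedure.

```python
from collections import defaultdict


def employment_change_table(reports):
    """Summarize, by industry, the net change in employment and the number
    of firms that added workers, dropped workers, or stayed the same."""
    table = defaultdict(lambda: {
        "net_change": 0,
        "firms_increasing": 0,
        "firms_decreasing": 0,
        "firms_unchanged": 0,
    })
    for industry, _firm_id, previous, current in reports:
        row = table[industry]
        difference = current - previous
        row["net_change"] += difference
        if difference > 0:
            row["firms_increasing"] += 1
        elif difference < 0:
            row["firms_decreasing"] += 1
        else:
            row["firms_unchanged"] += 1
    return dict(table)


# Hypothetical reports, for illustration only.
sample_reports = [
    ("millinery", "A", 40, 46),
    ("millinery", "B", 25, 19),
    ("millinery", "C", 10, 10),
    ("canning", "D", 120, 150),
    ("canning", "E", 80, 85),
]
burns_table = employment_change_table(sample_reports)
```

In the sample, the millinery industry shows no net change although one firm hired and another laid off workers, which is the kind of movement a total alone would conceal.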

* Arthur F. Burns, “Twenty-sixth Annual Report of the National Bureau of Economic Research,”
June 1946, p. 22.






Another type of analysis that may throw some light on economic
changes is the study of the employment changes in each industry by
size of firm. Some studies in certain New York industries have indi¬ 
cated that the largest and smallest firms have more stable employment 
than the medium-sized firms. New York State has also analyzed em¬ 
ployment patterns in the millinery and dress industries in separate 
groups according to the price of the product, and in the canning indus¬ 
try by the crop being canned. 

The collection of these employment and payroll data has involved 
problems of industrial classification and other troublesome details that 
are to be encountered in any large-scale undertaking. Such problems
will always be with us. They need not concern the analysts who will use
these data. One particular problem may be mentioned. At present, all 
employment and payroll reports submitted to the state agencies, and 
the monthly sample of establishment reports which are handled in cooperation
by the Bureau of Labor Statistics and the state agencies, both
cover the mid-week of the month (i.e., the week ending nearest the 15th 
of each month). The achievement of this uniformity is an obvious step, 
yet it required considerable effort and discussion. 

The unemployment insurance agencies also provide data on unem¬ 
ployment as a by-product of their operations. The 48 states and Alaska, 
Hawaii, and the District of Columbia all have individual laws with 
varying eligibility requirements for benefits. These minor differences 
have been given exaggerated importance by some persons who are not 
familiar with the practical situation. Essentially, the unemployment 
insurance laws define an unemployed person as anyone who is not 
working, is seeking work, and is willing and able to take a job if offered 
one. This definition is simple and unambiguous, and is in agreement 
with the average person’s idea of unemployment. An unemployed 
worker, in order to obtain insurance benefits, must report in person at 
the local unemployment insurance office nearest his residence, and must
fill out forms and answer verbal questions which are designed to obtain
information concerning his occupation, last employer, reason for loss of 
employment and his willingness and ability to take a job. This direct, 
personal contact with the unemployed worker is maintained, for he 
must report to the office once each week as long as he continues to be
unemployed and seeking benefits. 

Operating reports prepared weekly by each local office give the number
of persons reporting each week by type of claim, i.e., first claim and
continued claim. Central offices, which usually prepare and mail out the 
benefit checks, provide as part of their operations, additional data on
the previous industrial attachment and earnings of the claimants. The
existence of these operating data offers the opportunity for many types 
of studies of the characteristics of the unemployed and of the incidence 
of unemployment by industry, area, and occupation at negligible ex¬ 
pense. Two examples will illustrate some of these uses. Since 1940, New 
York has made a monthly sample study of benefit payments to deter¬ 
mine the industrial distribution of the claimants on a current basis. 
More recently, there has been an increase of about 25 per cent in the 
number of claimants during the last two months of 1948. A study has 
been made of all first claims received during these two months to de¬ 
termine the industries which were laying off people. 

The present incomplete coverage of the unemployment insurance 
laws is only of minor significance. In most of the cities in New York 
State, the persons now seeking benefits at the local insurance offices 
represent about 80 per cent of estimated total unemployment. Con¬ 
sidering the detailed information available by area and industry, and 
the personal, repeated contacts with this large majority of the unem¬ 
ployed group, it is clear that the unemployment insurance program has 
provided as a by-product, a major advance in the field of unemploy¬ 
ment statistics, far beyond anything available before 1935. 

New York State has been working on a long-range study which other 
states have also been urged to undertake by the Social Security Administration.*
Beginning with 1937, individual earnings records are
available for each year to date for every person employed in covered 
establishments. These records show quarterly earnings with each em¬ 
ployer. In addition, for those persons who were unemployed and filed 
claims for benefits, data are available for each year showing the number 
of weeks of compensable unemployment and the amount of benefits
received. For many administrative and legislative purposes, and in 
order to supply data requested by the Social Security Administration, 
an annual study has been made in New York to determine the number 
of benefit claimants by industry, duration of unemployment, and benefit
rate. By taking as a sample the approximately 10 per cent group of
persons whose Social Security account numbers end in the block 2,000 
to 2,999 (referred to as the “2,000 block”), and by entering the information
on individual record cards which have space for 14 years of employment
and unemployment data, it has been possible to obtain, as a
by-product of these required annual studies, a continuous record of the
employment pattern of a large sample of New York workers over a
period of many years. The detailed analysis of these materials has thus
far been limited by recurring budget difficulties resulting from inadequate
financing of unemployment insurance administration by Congress.
However, the data are produced each year as by-products of the
operations of the unemployment insurance system, and are being preserved
for analysis whenever adequate staff becomes available. This
long-range study of employment and unemployment by industry and
area since 1937, which can be made in New York and the other states,
should provide valuable data for both theoretical and practical work of
students of the business cycle, business and market analysis, and
economists and statisticians in general. (See figure 1.)

* A comparable long-range study of employment data available under the Old Age Insurance
Program is described in “The Continuous Work History Sample under Old Age and Survivors Insurance”
by Jacob Perlman and Benjamin Mandel, Social Security Bulletin, February 1944. New York's
study has been developed in collaboration with the staff of the Social Security Administration.
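
The selection rule for the “2,000 block” sample can be sketched as a simple filter on the last four digits of the account number. This is an illustration only: the function name and the sample account numbers are hypothetical, and the nine-digit format with optional hyphens is an assumption about how the numbers are written.

```python
def in_2000_block(account_number: str) -> bool:
    """Return True when the last four digits fall in the block 2,000 to 2,999."""
    digits = "".join(ch for ch in account_number if ch.isdigit())
    if len(digits) != 9:
        raise ValueError(f"expected nine digits, got {account_number!r}")
    serial = int(digits[-4:])
    return 2000 <= serial <= 2999


# Hypothetical account numbers, for illustration only.
accounts = ["123-45-2000", "123-45-1999", "987-65-2999", "987-65-3000"]
sample = [number for number in accounts if in_2000_block(number)]
```

Because the block covers 1,000 of the roughly 10,000 possible four-digit endings, the rule selects about one worker in ten, and the same workers fall in the sample every year, which is what makes a continuous record possible.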

These materials have also been used to a considerable extent in forecasting
work. This aspect can be touched on only briefly in this paper.
The forecasting is done for two major purposes. Unemployment in¬ 
surance benefits are paid out of a trust fund which in New York has 
totalled about $1 billion since 1945. The solvency of this trust fund 
is of course a major concern. Each year, all legislative proposals 
for changes in the benefit formula or in the taxing formula are analyzed 
and estimates of the effect on the solvency of the fund are prepared
under various assumptions as to the level and movement of employ¬ 
ment and payrolls during a 5 to 7 year period in the future. The effect 
of the end of the war was the subject of a comprehensive report which 
used all available materials on employment and unemployment by in¬ 
dustry and area as the background. The second major purpose is to 
estimate claim loads for budgetary and planning use. Administrative 
funds are provided by grants from Congress. A year in advance of the 
beginning of each fiscal year, estimates of unemployment insurance 
claims loads expected during the fiscal year must be submitted by each 
state. In addition, forecasts are prepared for each semi-annual period 
about four months before the beginning of each period; and finally 
monthly forecasts are prepared for each local office about two weeks in 
advance in order that local office staffs may be shifted to meet the load 
most efficiently. New York's experience in eleven years of operation 
may be of some interest and perhaps can be presented in detail in 
another paper. 
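
The solvency estimates described above amount to carrying a fund balance forward year by year under alternative assumptions. The sketch below is a deliberately simplified illustration, not New York's method: the function names, the scenario labels, and every figure (in millions of dollars) are hypothetical, and interest on the fund is ignored.

```python
def project_fund(opening_balance, yearly_flows):
    """Carry the fund forward; yearly_flows is a list of (contributions, benefits)."""
    balance = opening_balance
    path = []
    for contributions, benefits in yearly_flows:
        balance = balance + contributions - benefits
        path.append(balance)
    return path


def first_insolvent_year(path):
    """Return the first projection year (1-based) with a negative balance, or None."""
    for year, balance in enumerate(path, start=1):
        if balance < 0:
            return year
    return None


# Hypothetical five-year scenarios, in millions of dollars.
opening = 1000
scenarios = {
    "high employment": [(300, 200)] * 5,
    "severe recession": [(200, 500)] * 5,
}
results = {name: project_fund(opening, flows) for name, flows in scenarios.items()}
```

Each legislative proposal would change the contribution or benefit figures in such a projection, and the comparison of the resulting paths shows its effect on the fund.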

CONCEPTS OF EMPLOYMENT AND UNEMPLOYMENT

Early in 1948, the International Labour Office issued a report on
“Employment, Unemployment, and Labour Force Statistics” which is
subtitled “A Study of Methods.” This report was prepared for the Sixth
International Conference of Labour Statisticians held in Montreal in 
August 1947. 

This I.L.O. report comments in some detail on the methods being
used in this country and in other nations to collect data on employment
and unemployment and suggests a broad approach which is essential if
the available statistical resources are to be used efficiently.

The collection of data on employment and unemployment in this 
country is being done for many purposes, and by many different agencies.
However, only the unemployment insurance system provides detailed 
geographical and industrial information on both employment and un¬ 
employment entirely as a by-product of its routine operations. It is for 
this practical reason that it is likely that with universal coverage, the 
unemployment insurance program will be generally accepted as the
primary source of this type of statistics. It should be noted here that
as far back as 1937 the Committee on Government Statistics and Information
Services which was jointly sponsored by the American Statistical
Association and the Social Science Research Council, stated in
its report on “Government Statistics,”* “the present voluntary system
of payroll and employment reporting may be rendered obsolete with
the advent of unemployment compensation laws, which promise to
provide comprehensive returns as a by-product of administration.”

There will of course continue to be need for some of the other meth¬ 
ods now being used. Again the reason is a practical one. The problems 
that statisticians are concerned with in economic statistics in general 
and in studying changes in the level of employment in particular are 
complex. The problems will be solved, but as Professor Neyman re¬ 
marked in 1937 in discussing the analysis of economic time series, “it 
will not be done today or tomorrow.” It is important to recognize that 
all available tools and resources are needed if significant progress is to 
be made in this field of research. In a paper published 21 years ago,
Professor Harold Hotelling* pointed out that the Law of Gravitation 
could not have been established as it was by Newton if the sun were 
smaller and closer in size to some of the planets. If the sun were smaller, 
the family of planets would exhibit some of the complex movements of 
democratic societies and statistical methods not known in the 17th cen¬ 
tury would have been necessary to determine the Law of Gravitation. 

* “Government Statistics,” Bulletin 26, Social Science Research Council, April 1937, p. 92.

* Harold Hotelling, “Differential Equations Subject to Error and Population Estimates,” Journal
of the American Statistical Association, Sept. 1927, p. 287.

The problems of employment and unemployment are important to
many groups and agencies in this country. The Joint Congressional
Committee on the Economic Report in its recent report on “Current 
Gaps in Our Statistical Knowledge” refers to the need for extension of 
employment and unemployment statistics. The Committee states: 

"Such statistics are among the most important indicators of the economic 
situation. In this area, the Census Bureau's Current Population Survey 
is particularly valuable. The chief weakness of this monthly enumerative
survey is that it fails to show employment and unemployment on a geo¬ 
graphical basis, and that the sample is too small to provide reliable data 
on occupational and other characteristics of the unemployed. . . . The 
need for geographic detail can be met in part by annual surveys of individual
metropolitan areas . . . Annual surveys of employment and unemployment
for the major urban areas do not provide sufficiently current
data in a period of rapid change. For this purpose primary emphasis must 
be placed on employment data reported by employers. State data of this 
sort are compiled by state employment security agencies and state labor 
departments in cooperation with the Bureau of Labor Statistics which also 
compiles the corresponding national series, but area series are maintained 
by only a few state agencies.” 

There is no reference at all to the unemployment insurance data 
available on claims filed by unemployed persons which every state 
agency collects as a by-product of its operations and which are actually 
reported to the Social Security Administration in Washington each 
month for each local office in every state. 

The reference to the employment data reported for each month by 
all covered employers fails to emphasize the important fact that the 
data are available as a by-product of operations, and that the addi¬ 
tional money needed for analysis and publication is but a tiny fraction 
of the cost of obtaining such detailed data on a current basis by any 
agency not conducting a vast program like the unemployment in¬ 
surance system. 

It is not possible in this paper to discuss the Census Bureau House¬ 
hold Sample in any detail. Obtaining information by visiting a sample 
of households has some advantages and some important defects. This 
tool of research can be a useful supplement, at present, to the unem¬ 
ployment insurance by-product statistics. When universal coverage is 
attained, there will be little need for a household sample, except per¬ 
haps for some special purposes. 

There is an important point concerning the definitions of employ¬ 
ment and unemployment. The Census Bureau uses a definition of un¬ 
employment which is not in accord with the concept used in the state 
laws described above. The definition suggested by the International 
Labour Office report is similar to the one used by the unemployment
insurance laws, namely that “the unemployed should include all persons
seeking work on a given day who are not employed but are able to
take a job if offered one.” 

The I.L.O. report also comments on the Census Bureau insistence on 
including as unemployed only those who have been without work for at 
least the Monday through Saturday period in the enumeration week. 
The I.L.O. report estimates that unemployment was thus understated 
about 7 per cent during the period June 1941-May 1942.

It would be a major step forward if the Census Bureau would bring 
its concepts and definitions more in line with the concept of unemployment
which underlies the laws of the 51 unemployment insurance
agencies. Whatever technical difficulties stand in the way, they must
be overcome if the Census Bureau work in this field is to play its proper 
role as a supplement to the by-product data. 

The I.L.O. report has some pertinent comments on the use of unemployment
insurance statistics as a measure of unemployment. The
limitations it mentions are of some significance at present. However, 
with universal coverage and a probable extension of duration of bene¬ 
fits to 26 weeks in every state, these limitations will no longer be signifi¬ 
cant. 

One quotation from the I.L.O. report will serve to summarize some 
of the views expressed in this paper: 

"Social insurance statistics of unemployment are in a very real sense cost 
free, being by-products of the operation of a system installed for other 
than statistical purposes. It therefore becomes possible to expand statis¬ 
tical coverage and derive additional estimates (as well as to secure the 
basic estimates) at a remarkably low cost assignable to statistical pur¬ 
poses. Furthermore, the continuing contact of a social insurance system 
with individual workers makes it possible to conduct a variety of special 
studies on unemployment problems at low cost and with little inconveni¬ 
ence to the employee. The possibilities of using social insurance data for 
the study of unemployment problems have only begun to be explored.” 

The rich mine of statistical data available as a by-product of our 
unemployment insurance system if carefully analyzed may very well 
lead to significant progress in economic theory and to a better under¬ 
standing of the movements of the business cycle. Analysts and research 
workers in this field should examine the data already available, and 
should encourage the analysis and publication of data which are being 
collected in all the states, but which, in some states, may merely be 
stored away or filed for lack of funds and outside interest. 



STATISTICAL REQUIREMENTS FOR 
ECONOMIC MOBILIZATION 

Ralph J. Watkins 
Office of Plans and Programs
National Security Resources Board

It is important at the outset to make clear the distinction between
mobilization and mobilization planning. In dictionary terms, mo¬ 
bilization is defined as the “act of assembling, equipping, and preparing 
military and naval forces for active hostilities.” It is scarcely necessary 
to say that the American nation is not engaged in mobilization; nor 
are we preparing for mobilization. Mobilization planning is quite an¬ 
other matter, as I shall try to show presently. 

The National Security Act of 1947 established the National Security 
Resources Board and assigned to that Board the statutory function of 
advising the President concerning the coordination of military, industrial,
and civilian mobilization, including advice to the President on
certain specific matters having to do with effective mobilization of 
resources in the event of war and with certain economic readiness
measures against the contingency of war. Thus these functions make 
of the National Security Resources Board an economic mobilization 
planning agency set up to advise the President. It is important to note 
that the Board has no operating functions in the governmental sense 
of the term; its duty is to advise the President. 

Economic mobilization planning may be defined as the process of 
estimating the requirements or needs of war; of appraising the resources 
or means which would be available for meeting those needs; of measur¬ 
ing deficiencies revealed by the comparison of needs with means; and 
of determining the steps necessary to balance needs with means—all 
to the end that there may be available well-articulated plans for 
mobilizing the resources of the nation in the event of war. It is important
to stress the words “in the event of war.” We are not engaged
in planning “for war”; rather, we are planning against the contingency 
of war. 

The philosophy of mobilization planning, both military and eco¬ 
nomic, rests on the premise that a state of preparedness is one of the 
means of lessening the likelihood of an aggressive attack against the 
nation and at the same time one of the means of increasing the likelihood
of winning a war, if the nation is forced into war. In the uncertain
world in which we live, we can with prudence do no less than to
take appropriate steps to improve our economic readiness position 
against the contingency of war and to lay plans for the rapid and 
effective mobilization of our resources in the event of war. 

It will be recognized at once that the definition of economic mobili¬ 
zation planning which I have outlined is one which in many fields— 
not all fields by any means—lends itself to translation in statistical 
terms. 

For example, one of the specific statutory functions assigned to the
National Security Resources Board is to advise the President on “the 
relationship between potential supplies of, and potential requirements 
for, manpower, resources, and productive facilities in time of war.” 
We interpret that provision to mean that this advice must be given in 
time of peace against the contingency of war, as well as in time of war.
The basic and fundamental tool in a program of economic mobiliza¬ 
tion planning consists of detailed balance sheets of resources and re¬ 
quirements—above all for key raw materials but also for major end 
products, critical components, fuels, electric power, transportation, 
and manpower. 

In time of peace, such balance sheets provide the factual basis for 
advice to the President from time to time as to the steps that should be 
taken to strengthen our economic readiness position or to lessen our 
economic vulnerability against the contingency of war. In time of war, 
this balance sheet process would provide the factual basis for opera¬ 
tions by the war agencies. 
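
The balance sheet idea, comparing needs with means and measuring the deficiency for each item, can be sketched as follows. This is an illustration under stated assumptions, not the Board's procedure: the function name, the grouping of requirements by claimant ("military", "civilian"), the units, and all figures are hypothetical.

```python
def balance_sheet(resources, requirements):
    """Compare supply with total requirements for each item.

    resources:    {item: quantity available}
    requirements: {item: {claimant: quantity needed}}
    """
    rows = []
    for item in sorted(set(resources) | set(requirements)):
        supply = resources.get(item, 0)
        need = sum(requirements.get(item, {}).values())
        rows.append({
            "item": item,
            "supply": supply,
            "need": need,
            "deficiency": max(need - supply, 0),
        })
    return rows


# Hypothetical quantities in arbitrary units, for illustration only.
resources = {"steel": 90, "copper": 5}
requirements = {
    "steel": {"military": 40, "civilian": 60},
    "copper": {"military": 2, "civilian": 2},
}
sheet = balance_sheet(resources, requirements)
deficient_items = [row["item"] for row in sheet if row["deficiency"] > 0]
```

Items with a positive deficiency are those for which steps would be needed to balance needs with means.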

What are we doing to meet this fundamental need for resources— 
requirements balance sheets? We have under way three basic programs: 

1st. Quick and approximate estimates of requirements to test the 
economic feasibility of strategic plans. 

2nd. Balance sheets for Fiscal 1950 of key resources and security 
program requirements. 

3rd. Detailed estimates of resources and mobilization require¬ 
ments on a systematic and continuing basis. 

These three programs will be outlined briefly.

First: Quick and approximate estimates of mobilization requirements
to test the economic feasibility of strategic plans. The Munitions Board 
in the National Military Establishment is now developing quick esti¬ 
mates—or flash estimates as they call them—of military requirements 
under the assumptions outlined in the strategic plans. These will 
embrace the needs of the Military Establishment for the key materials of 
steel, copper, aluminum, and petroleum; for construction; for many key 



408 AMERICAN STATISTICAL ASSOCIATION JOURNAL, SEPTEMBER 1949 

items of military equipment; and for manpower. When completed, 
these estimates will be submitted to the National Security Resources 
Board. Concurrently with this work, the National Security Resources 
Board is securing comparable quick estimates of mobilization 
requirements from the other security agencies—the Atomic Energy 
Commission and the United States Maritime Commission. In cooperation with 
the Departments of Commerce, Interior, State, and Labor, we are also 
developing quick statistical estimates of the mobilization requirements 
of the civilian economy. These last named civilian mobilization needs 
are being estimated initially on the basis of the civilian economy as it 
existed at the peak of our war effort in World War II, with, of course, 
appropriate adjustments for changes in population, changes in inven¬ 
tories, and changes in patterns of consumption. We will then add these 
estimates to those supplied by the Munitions Board to secure a measure 
of estimated total mobilization requirements. 

We will next compare these mobilization needs with the estimates of 
available resources, which are being prepared also in cooperation with 
the Departments of Commerce, Interior, State, and Labor. On the basis 
of that balancing of mobilization needs against means, we will be able 
to say to the National Military Establishment whether the strategic 
plans can be encompassed within the limits of our resources, and if not, 
why, and in what particulars. 
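
The balancing of needs against means described above can be pictured as a simple calculation. What follows is only a minimal sketch in Python, not the Board's actual procedure: the names BalanceLine and shortfalls, the choice of materials, and every figure are hypothetical and are used for illustration only.

```python
from dataclasses import dataclass


@dataclass
class BalanceLine:
    """One line of a resources-requirements balance sheet (hypothetical layout)."""
    material: str
    resources: float   # estimated available supply
    military: float    # estimated military requirements
    civilian: float    # estimated civilian requirements

    @property
    def requirements(self) -> float:
        # Total mobilization requirements = military + civilian estimates.
        return self.military + self.civilian

    @property
    def balance(self) -> float:
        # Positive: surplus of resources. Negative: shortfall.
        return self.resources - self.requirements


def shortfalls(lines):
    """Return (material, deficit) for every line where needs exceed means."""
    return [(line.material, -line.balance) for line in lines if line.balance < 0]


# Illustrative figures only, in arbitrary units.
sheet = [
    BalanceLine("steel", resources=100.0, military=45.0, civilian=50.0),
    BalanceLine("copper", resources=20.0, military=15.0, civilian=9.0),
]
deficits = shortfalls(sheet)
print(deficits)
```

A list of deficits of this kind is one way to express "in what particulars" a plan exceeds the limits of resources; a real balance sheet would of course carry far more detail.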

Second— Balance sheets for Fiscal 1950 of key resources and security 
program requirements. At the request of the President, the Board has 
undertaken, with the cooperation of 21 federal departments and agen¬ 
cies, to appraise the resources available to the nation in relation to the 
requirements in Fiscal 1950 of the several national security programs— 
current and anticipated. That survey will include a wide range of bal¬ 
ance sheets covering the major strategic and critical materials, fuels, 
power, transportation, manpower, and the key manufacturing indus¬ 
tries of the nation. These balance sheets will serve as the basis for 
advice to the President on questions of national security policy. For 
example, they will indicate: (1) to what extent our physical resources 
may set limits to the national security programs; (2) to what extent 
these programs can be accomplished without controls; (3) to what 
extent controls would be required to assure completion of these pro¬ 
grams; or (4) to what extent controls would be required to protect the 
economy from price spirals in the raw material markets. Likewise, 
these balance sheets will afford a factual basis for economic readiness 
measures aimed at increasing our resources which the Board may see 
fit to recommend to the President. 

Third— Detailed estimates of resources and mobilization requirements 



ECONOMIC MOBILIZATION 


409 

on a systematic and continuing basis. Coming back to the assumption 
that the strategic plans meet the tests of economic feasibility which I 
have described: On the basis of the assumptions outlined in those plans, 
the National Security Resources Board must assemble from the several 
federal departments and agencies on a continuing and systematic basis 
estimates of key and major resources and key and major mobilization 
requirements, both military and civilian. The extent to which this sta¬ 
tistical planning process can be projected currently remains to be de¬ 
termined. In consequence, I can merely outline in general terms my 
own views. The Munitions Board, with the cooperation of the Service, 
will estimate detailed military mobilization requirements for submis¬ 
sion to the National Security Resources Board. The Board must look 
to the appropriate civilian departments and agencies for the estimation 
of civilian mobilization requirements and likewise for estimates of 
available resources, as their continuing areas of responsibility in the 
task of economic mobilization planning. 

The staff of the National Security Resources Board is now engaged in 
the preparation of requirements manuals. These manuals will outline 
the necessary assumptions, methods, procedures, and forms. Out of this 
process could be developed balance sheets covering major and critical 
needs. For their review, an Interdepartmental Committee on Program 
Balance, with appropriate subcommittees, may be necessary for critical 
screening and review, in the manner followed by the Requirements 
Committee in the War Production Board during World War II. The 
objective would be to plan for a balanced program for military and 
civilian requirements, against which resource allocations could be 
planned. 

We realize, of course, that the problem is so great and so complex 
that we cannot hope, in time of peace, to do more than to design the 
machinery; to train a nucleus personnel within the National Security 
Resources Board and other agencies; and through peacetime operations 
produce, as the factual basis for mobilization planning decisions, bal¬ 
ance sheets covering only key and major end-products and components, 
and their translation into requirements for critical raw materials, fuel, 
power, transportation, and manpower. Through this continuing bal¬ 
ance sheet operation we would serve four basic mobilization planning 
purposes: 

1st. We would provide systematic measurement of the relation¬ 
ship between resources and mobilization requirements and 
thus provide a factual foundation for advice to the President 
on economic readiness measures. 

         2nd. We would develop procedures and techniques which in the 
event of war would be available for determining in detail the 
hundreds of resources-requirements balance sheets on which 
the war agencies could plan their operations. 

3rd. We would develop procedures and interdepartmental ad¬ 
ministrative machinery for analyzing and integrating these 
balance sheets to achieve program balance. 

4th. We would train a skeleton staff within NSRB and in other 
agencies of government which could quickly be expanded to 
conduct this vital operation. 

It may properly be asked where the statistical records can be found 
for the economic mobilization planning which I have described. The 
general answer must be that in time of peace these must come from 
existing records and established channels for the statistical reporting 
of economic intelligence. Special mention, however, should be made of 
the valuable statistical materials to be found in the reports and records 
of the War Production Board and other war agencies. Of particular 
value are the records on bills of materials for translating end-product 
and component requirements into their raw material content. It may 
be added also that the Armed Services are adding to our bills-of-ma- 
terials records through special provisions in procurement contracts. 
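
The use of bills of materials to translate end-product requirements into raw material content can be illustrated with a short calculation. This is a minimal sketch under assumed data: the products, the per-unit material quantities, and the function name raw_material_requirements are all hypothetical and do not come from the War Production Board records.

```python
def raw_material_requirements(end_products, bills_of_materials):
    """Sum raw material needs over a program of end products.

    end_products:       {product: number of units required}
    bills_of_materials: {product: {material: quantity per unit}}
    """
    totals = {}
    for product, units in end_products.items():
        for material, per_unit in bills_of_materials[product].items():
            totals[material] = totals.get(material, 0) + units * per_unit
    return totals


# Hypothetical bills of materials, in pounds per unit.
bom = {
    "truck": {"steel_lb": 3000, "copper_lb": 50},
    "radio": {"steel_lb": 5, "copper_lb": 2},
}
program = {"truck": 10, "radio": 100}

needs = raw_material_requirements(program, bom)
print(needs)
```

The same multiplication, repeated over thousands of items, is what makes accurate bills-of-materials records valuable for planning.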

Let us now consider briefly the nature of statistical requirements in 
time of war. Our experience in two world wars has shown clearly that 
effective prosecution of a major war is not possible without resort to 
numerous economic controls. National policy then demands that the 
sum total of business operations accomplish specific and well-defined 
objectives. To meet these objectives, government is forced into day-to- 
day relations with business units. Government administrators, as a 
result, must collect data showing a balanced factual picture of the 
economy. Business management requires a continuing flow of accurate 
and detailed information to make the kinds of decisions needed to con¬ 
form to government regulations and to effect the maximum contribu¬ 
tion of the concern’s resources to the national objectives. 

Statistics are needed to devise these war-time controls and to serve 
in their enforcement. To determine the nature and the extent of controls 
of manpower, for example, we need detailed statistical information on 
geographic distribution of employment, on the availability or non¬ 
availability of specific skills for specific industries in specific areas. 
Likewise, industrial controls cannot be established and effectively en¬ 
forced without detailed information on needs, on usage, and on inven¬ 
tories. This information must be far more detailed and specific than 
that ordinarily available to the federal government. This information 
must give us facts not only on the quantities of general types of com¬ 
modities and products but on their varieties and shapes. It must give 
accurate indication of the physical location of inventories, their owner¬ 
ship, and their intended use. Similar detailed information is needed on 
capacities, production plans, and schedules, on storage and transporta¬ 
tion, and on a multiplicity of other activities and needs. The collection 
of such information must be centrally planned. Its analysis and inter¬ 
pretation must be centrally coordinated. 

Once initial controls have been established, detailed statistics are 
required as tools for the daily tasks of administering those controls. It 
is then that government finds itself in business on a large scale, buying 
and selling enormous quantities of materials and equipment. Conducting 
business on such a scale calls for managerial statistics which are 
intimately related to the accounting records from which such statistical 
information is derived. This type of statistics is in daily use by the 
management of private enterprises, but in time of peace the collection 
of such statistics is generally beyond the province of the federal gov¬ 
ernment. In time of war, however, the federal government directs a 
large share of the economic activities of the nation, closely controls 
many of them, and must inject itself in many day-to-day transactions 
between sellers and buyers. Statistics required by the government to 
perform these functions naturally are not peace-time voluntary reports 
but the mandatory collections of a mass of detailed facts the reporting 
of which is justified by the emergency of war. 

Perhaps the simplest way of indicating the magnitude of the problem 
of war-time control statistics is to note that the Catalog of War Production 
Board Reporting and Application Forms required nine large-sized 
volumes aggregating more than 3,000 pages merely to reproduce the 
most useful 1,200 of the 4,400 questionnaires developed and used by 
that agency; and an additional volume to list alphabetically the 5,000- 
odd commodities reported on in these 1,200 forms. 

As stated in the General Introduction to that Catalog: “To mobilize 
our production resources for war, to insure that the products of mines, 
forests, factories, and chemical laboratories were utilized efficiently, to 
integrate materials and components into the greatest possible volume 
of finished products and to channel the distribution of finished products 
to the military and export agencies and to the domestic civilian popula¬ 
tion in the manner most directly related to winning the war, the WPB 
assembled a quantity of information of unprecedented magnitude and 
detail. The statistics collected included data on basic materials, 
semi-fabricated materials, components, subassemblies, and finished end 
items. They also provided related information on the purpose for which 
the products were used and the kinds of ultimate consumers who used 
them.” 

It would, of course, be dangerous to assume that the statistical ac¬ 
tivities of the emergency agencies in World War II represented models 
of efficiency. Quite to the contrary, students of the problem can cite 
numerous examples of bad planning of statistical reporting and the 
collection of voluminous facts which sometimes could not even be 
tabulated fast enough to serve as guides to administrative action in a 
rapidly changing situation. It is, therefore, incumbent on statisticians 
engaged in mobilization planning to look with a critical eye on the 
statistical reporting of World War II; to seek earnestly to learn from 
that experience through selection of the good patterns and discarding of 
the wasteful and unnecessarily complicated patterns; and to strive for 
simplicity in designing the statistical controls for a possible future 
emergency. 



THE WAR PRODUCTION BOARD'S STATISTICAL 
REPORTING EXPERIENCE, V AND VI 

David Novick 
University of Puerto Rico 

AND 

George A. Steiner 
University of Illinois 

This is the last of a series of four articles, in six parts, con¬ 
cerned with the statistical reporting experience of the WPB 
and its predecessor agencies. In this article there is presented 
an analysis of a few of the outstanding tabulation problems of 
the WPB and its principal tabulating agent, the Bureau of the 
Census. In addition, the series is concluded with a statement 
of a few outstanding lessons which the WPB’s statistical ex¬ 
perience has revealed to be strategic in the establishment of a 
suitable emergency industrial statistical reporting program. 

PART V 

TABULATING DATA 
Introduction 

The success with which the WPB was able to acquire accurate data 
and to compile them quickly into meaningful summaries deter¬ 
mined in a very real way the effectiveness of basic policy decisions and 
administrative control actions. The tabulation of statistics should be 
an elementary process. Under wartime pressures and data demands, 
however, there were hidden in this apparently simple process many 
dangers. The greater the volume of data to be tabulated and the closer 
it related to operations the greater these dangers became. Improper 
arrangement of even the most minor details in final summaries many 
times invalidated an entire tabulation and the details from which it 
was compiled. Such failures produced a lack of confidence in a series, 
making the figures ineffectual for the purposes designed. 

Principles and procedures used to aggregate statistical series into 
summaries are basically the same for both great masses of data and 
figures in small volume. But the WPB came to learn that problems in¬ 
cident to massive statistical compilations, particularly when they were 
to be used for administrative rather than purely statistical purposes, 
were of real significance to its effective operation. The main thesis of 
this article is to study a few outstanding tabulation problems facing 
the WPB in its attempt to aggregate masses of data for managerial 
functions, to show how they were solved, and to note the impact on 
WPB management. 

Wartime Complication of the Tabulating Process 

The wartime tabulation process was complicated by three elements: 
(1) pressures of time, (2) the volume and complexity of numbers de¬ 
manding addition, and (3) special wartime tabulation needs which 
required adaptation and often important modification of peacetime 
mass tabulation techniques. These forces became the more difficult to 
control because of severe limitations of time and of large-scale tabula¬ 
tion facilities and experience. 

Wartime operations pressed on the time cycle of statistical processing 
from every direction. The result was an acceleration of all steps in 
tabulation procedure. The greedy demands of wartime operations were 
never satisfied and all energies were constantly directed towards effect¬ 
ing a further tightening of the tabulation time schedule. 

The avalanche of statistics collected by the WPB severely strained 
available resources for aggregating them and gave rise to a series of 
problems. Mechanical and personnel resources were at times strained 
to the breaking point. 

Wartime operational needs injected new problems for both industry 
and government into peacetime tabulation techniques. For industry 
they necessitated the establishment of new records, finer detail in 
records, an unremitting flow of information, and quick filing of reports. 
The result naturally was reporting inconsistency, incompletion of forms, 
failure to use the specified measures and outright clerical errors of 
serious proportions. Large-scale tabulation methods faced the problems 
of quickly unifying this extraordinary mass of diverse information 
and aggregating it in a way often different from that specified on the 
basis of past experience. The penetrating eye of war sought much more 
detail in summaries and greater accuracy than was required under 
ordinary circumstances. These were not the result of unreasonable 
expectations of the novice; they sprang from the urgencies of im¬ 
mediate administrative action which probed deeply into the heart of 
industrial production and distribution. 

Two fundamental results grew out of these considerations. First, 
the care born of leisure could not be devoted to the methods and 
devices by which basic data requests were formulated and answered. 
Second, time was not available to permit the careful planning of all 
aspects of the mass tabulation procedure. Even so, many of the new 



STATISTICS IN THE WPB 


415 


problems were matters which could not be resolved by analysis of 
peacetime experience but had to be answered by trial-and-error 
methods. The net result was the development of one more strain on the 
entire administrative process which in turn multiplied and hardened 
still further the compulsions of time. 

The Production Requirements Plan Tabulations¹ 

In a very real sense, form PD-25A, the operating instrument for the 
PRP, embodied most of the more serious tabulation problems en¬ 
countered by the WPB and its most important tabulating agent, the 
Bureau of the Census. For this reason it is informative to study in 
detail the process by which form PD-25A was tabulated, especially 
the nature and technique of its product coding. For this purpose we 
shall choose the tabulation for the fourth quarter 1942 PD-25A upon 
which were reported second quarter 1942 actual metal use and fourth 
quarter 1942 critical material requirements. Although this may be 
considered typical it should be mentioned that various changes in the 
form itself; manufacturer and WPB experience in processing; and 
changes in the variety of final tabulation requirements—all resulted 
in slightly different tabulation techniques and problems for each 
PD-25A quarterly tabulation. 

The first step in the tabulation of PD-25A was sorting schedules to 
facilitate coding, review, and tabulation. The most important sorting 
was for size. In view of the fact that it was impossible to tabulate 
within the time available all PD-25A schedules received (about 
35,000), smaller plant schedules were sorted out and marked for ex¬ 
clusion from tabulation. If a company reported less than 30 tons of 
total steel or 10,000 pounds of all other items together its schedule 
was rejected for purpose of tabulation. The reason for this treatment 
grew out of studies made regarding the extent of metal concentration 
among manufacturers and the fact that it would have been impossible 
to meet a tight time schedule if the tabulation included the thousands 
of such small cases. On the basis of PD-275 returns it was calculated 
that the small cases rejected and the thousands of small plants not 
required to file PD-25A, consumed no more than five per cent of the 
metal used by all plants meeting PD-25A reporting standards, as defined 
in Priorities Regulation No. 11. The small cases which were rejected 
consumed less than one-half of one per cent of the total material 
used by all industry. 
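
The size sort described above amounts to a threshold test applied to each schedule. The sketch below is hypothetical: the wording "less than 30 tons of total steel or 10,000 pounds of all other items together" can be read in more than one way, and this example assumes, as a labeled guess, that a schedule was set aside only when it fell below both limits. The flag require_both switches to the other reading. Plant names and quantities are invented.

```python
STEEL_CUTOFF_TONS = 30       # from the text
OTHER_CUTOFF_LB = 10_000     # from the text


def excluded_from_tabulation(steel_tons, other_lb, require_both=True):
    """Return True if a schedule would be set aside as a small case.

    require_both=True  : assumed reading, small on steel AND on other items.
    require_both=False : alternative reading, small on either measure.
    """
    small_steel = steel_tons < STEEL_CUTOFF_TONS
    small_other = other_lb < OTHER_CUTOFF_LB
    if require_both:
        return small_steel and small_other
    return small_steel or small_other


# Invented schedules for illustration.
schedules = [
    {"plant": "A", "steel_tons": 12, "other_lb": 4_000},
    {"plant": "B", "steel_tons": 250, "other_lb": 2_000},
    {"plant": "C", "steel_tons": 5, "other_lb": 60_000},
]
kept = [s["plant"] for s in schedules
        if not excluded_from_tabulation(s["steel_tons"], s["other_lb"])]
print(kept)
```

Under the assumed reading only plant A is set aside; under the alternative reading all three would be, which is one reason the first reading seems the more plausible.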


1 For a summary of the operation of the Production Requirements Plan, see the first article in this 
series, Volume 43, Number 242, June 1948, p. 220. 





The next step in tabulation related to product coding of the schedules 
to be included in the tabulation. It was in this area that the major 
problems arose in PD-25A tabulations and their use. 

The purpose of a product code is to assemble products which are the 
same, similar, or closely related, into fairly homogeneous groups by 
assigning the same number to all items included within the same group. 
These numerical symbols identify the product groups throughout the 
process of tabulation thereby insuring final summarization for com¬ 
parable products. In the process of summarization, figures representing 
material consumption by producers of edge tools, for example, are 
meaningful only if it is reasonably certain that the figures include all 
reports for producers of certain specified edge tools. Uniformity of con¬ 
cept of product groups becomes indispensable when condensed reports 
by commodity groups are required without listing individual items. 
Under these conditions the choice is a simple one: uniform product 
classification or completely useless tabulation. 
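
The role of a product code in summarization can be shown with a small example. This is a sketch only: the code numbers, the product names, and the function summarize are hypothetical, since the actual WPB code list is not reproduced in this article.

```python
from collections import defaultdict

# Hypothetical product codes. Items of the same group share one number.
PRODUCT_CODE = {
    "axes": 310,
    "chisels": 310,
    "hatchets": 310,
    "kitchen utensils": 520,
}


def summarize(reports, product_code=PRODUCT_CODE):
    """Total reported material use (pounds) by product code.

    Items with no assigned code are returned separately so that they
    can be reviewed rather than silently dropped or misfiled.
    """
    totals = defaultdict(float)
    uncoded = []
    for product, pounds in reports:
        code = product_code.get(product)
        if code is None:
            uncoded.append(product)
            continue
        totals[code] += pounds
    return dict(totals), uncoded


reports = [
    ("axes", 1200.0),
    ("chisels", 800.0),
    ("kitchen utensils", 500.0),
    ("widgets", 50.0),   # not in the code list
]
totals, uncoded = summarize(reports)
print(totals, uncoded)
```

The edge tool total is meaningful only because every edge tool report carries the same number; an item coded inconsistently would distort two groups at once.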

Various government agencies, particularly the Bureau of the Census, 
for many years had devoted considerable attention to the proper clas¬ 
sification of industrial data. But this work, as discussed in Article III, 
Part IV, was not suitable for wartime use and as a result new classifications 
had to be developed. The fruition of this work, it was previously 
pointed out, resulted in the product codes used to tabulate PD-275 and 
PD-25A. Under the code worked out for tabulating these forms, it was 
possible to classify production data with the specific product to which 
they related and to provide summary information which the WPB 
could use for product administration. 

The heart of the difficult coding problem of the PD-25A tabulation 
lay in the desire of WPB Divisions to obtain as fine product details as 
possible. The basic limitation on product group expansion, however, 
lay in the extent to which manufacturers themselves reported detailed 
data by individual products. PD-25A applications were developed by a 
manufacturer upon the smallest operational unit which his experience 
led him to establish. For the most part these units reflected common 
experience in terms of fabricating facilities, labor problems, processes, 
materials used, or products made. As a result, there existed in each 
industry a substantial group of closely related inventory control units, 
even though the ultimate use of the output may have been as dissimilar 
as vitreous enamel kitchen utensils and vitreous enamel hospital 
utensils or Army and Navy mess equipment. Summarization of PD- 
25A data for policy and administration was limited by the extent to 
which current industry records could relate material use, inventory, 
consumption, and requirements to detailed products. Efforts to depart 
very far from the reports themselves could easily invalidate the final 
tabulations. 

Hindsight leads to the conclusion, however, that a substantial educational 
program explaining the needs of the PRP and PD-25A might 
have provided the needed inspiration for manufacturers to report finer 
product details. The PD-25A, covering requirements for the first 
quarter 1943, contained such a statement of the product detail desired. 
The results permit the conclusion that had such information been 
given industry from the time of the first PD-275 the end result might 
have been far finer product classes than were finally developed. 

Each PRP schedule contained three product codes: the minor, the 
secondary, and the major product code. Each code was based on 230 
product groups (later expanded to 440) and was determined by the 
description of the products reported in Section B of the PD-25A application, 
shown in the accompanying illustration. The minor code 
represented the finest product detail reported and made possible a 
tabulation of dollar shipments for each product, the summarization of 
which was completely related to a homogeneous class of products. The 
secondary product code, the most significant of all, represented the 
most detailed breakdown of material data in relation to product 
groups. It was this code, into which the vital data for WPB material 
distribution operations were tabulated, that set the patterns of over-all 
metal authorization for groups of products, and it was this code which 
created most of the controversy surrounding PD-25A tabulation. Its 
exact composition, therefore, deserves further comment. 

Product shipment information was reported in Section B of PD-25A. 
Total detailed metal consumption, inventory, and requirements for all 
products produced from materials included in the smallest inventory 
unit of the plant were reported in Section E, Part 1, and material 
usage and requirements for specific products were presented in Section 
E, Part 2. 

If a manufacturer reported in Section E, Part 2, information for each 
product specified in Section B, the minor and the secondary codes 
would have been the same. But if, as was more common, a manufacturer 
reported fewer products in Section E, Part 2, than in Section B, 
the two would be different. This happened because manufacturers 
reported product details in Section B which could not be related to 
materials in Section E, Part 2. A producer of circuit breakers, for 
example, might report shipments for a variety of types in Section B but 
lump main types in Section E, Part 2, simply because materials used 
for detailed types came from the same inventory bin. 

[Illustration: form PD-25A; not legible in this copy.] 

The problem may be explained more fully in this way. Suppose the Ajax 
Corporation reported in Section B, as follows: 


Products                                    Dollar Shipments 

Jigs                                        $10,000 
Dies                                         10,000 
Other machine tool accessories and parts     10,000 
Electric conduit metal clamps                 2,000 


Each of the above items would be given a minor code. The first three, 
being items of the same genus, would carry one secondary code, and 
the fourth item another secondary code. Now, suppose Ajax either did 
not provide a Section E, Part 2 for any of these items, or combined the 
fourth item with one of the first three. Either of these alternatives 
would have raised a serious coding problem. The following details 
should have been reported for clean-cut tabulation: (a) use and 
requirements for dies, jigs, and other machine tool accessories and parts, 
and (b) use and requirements for electric conduit metal clamps. 

In a situation such as this, several alternatives were open. The metal 
data for all products, including that for electric conduit clamps, could 
have been thrown into a product group the composition of which was 
solely devoted to information about machine tool accessories, since it 
was clear that the major products of the company fell into this category. 
For a small plant this would have had little importance. If the Ajax 
Corporation, however, had been larger than it is shown to be, a decision 
to do this would have seriously affected the accuracy of the over-all 
tabulations. It would have impaired the accuracy not only of the 
product group designed for machine tool accessories (which would have 
been loaded with other irrelevant data) but also of the product group 
concerned with electric conduit metal clamps (which would not have 
reflected the true volume of materials used and required for this 
product). 

In view of this situation, of course, every effort had to be bent toward 
segregating information for both groups. This sort of problem was 
solved on the basis of information submitted by the plant on preceding 
questionnaires; from the knowledge of the Bureau of Census technical 
experts; from WPB Division personnel having cognizance over the 
plant; or from direct correspondence with the company. Frequently, 
the schedule itself furnished information by which good estimates of 
the true situation were calculated. 

Finally, a major code was given to each schedule to indicate the 
major product class of the plant or principal product as determined by 
dollar shipments. This code was necessary to tabulate receipts, use and 
inventory information for which no secondary product group tabulation 
was possible. The form did not provide for such a distinction of 
inventories. This decision in formulating the questionnaire, of course, 
was dictated by industrial practices and the tremendous difficulties 
which efforts to do otherwise would have imposed upon industry. 
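
The relation between secondary and major codes can be traced through the Ajax Corporation figures given earlier. The following is a minimal sketch, not the Bureau of the Census procedure: the code numbers 101 and 202 are invented, and taking the major code to be the secondary group with the largest dollar shipments is an assumption about how "principal product as determined by dollar shipments" was applied.

```python
# Dollar shipments as reported by the Ajax Corporation in the example above.
shipments = {
    "jigs": 10_000,
    "dies": 10_000,
    "other machine tool accessories and parts": 10_000,
    "electric conduit metal clamps": 2_000,
}

# Hypothetical secondary codes: the first three items share one group.
SECONDARY_OF = {
    "jigs": 101,
    "dies": 101,
    "other machine tool accessories and parts": 101,
    "electric conduit metal clamps": 202,
}


def shipments_by_secondary(shipments, secondary_of):
    """Total dollar shipments for each secondary product code."""
    totals = {}
    for product, dollars in shipments.items():
        code = secondary_of[product]
        totals[code] = totals.get(code, 0) + dollars
    return totals


def major_code(shipments, secondary_of):
    """Assumed rule: the group with the largest dollar shipments."""
    totals = shipments_by_secondary(shipments, secondary_of)
    return max(totals, key=totals.get)


by_secondary = shipments_by_secondary(shipments, SECONDARY_OF)
major = major_code(shipments, SECONDARY_OF)
print(by_secondary, major)
```

On these figures the machine tool accessories group carries $30,000 of the $32,000 shipped, so the whole schedule's receipts and inventory would be tabulated under that group.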

Other codes were placed upon each PD-25A schedule, such as a geo¬ 
graphic code to permit tabulation by States and WPB Regions; a WPB 
Industry Division code, to show which section of the WPB would 
process the form and to provide it with a tabulation of concern to it 
alone; and various other codes designed for technical purposes, such as 
a completion code, a size code, etc. All of this coding procedure was 
dictated by the complexity of the final tabulations required. 

Review and editing of PD-25A constituted a large and important 
operation. The principal functions of editing were to review informa¬ 
tion reported, to detect errors and correct them, and to prepare the 
schedule for machine tabulation. The report was surveyed to make sure 
that it was in usable form, was accurate and consistent with instructions, 
and would not disturb the over-all accuracy of the final tabulations. 
It is clear that tabulation of unedited or unreviewed schedules 
would have resulted in completely meaningless tabulations. 

There were many instances where data reported on the PD-25A 
needed adjustment, but only a few of the less technical points need be 
treated here. Shipments were sometimes reported only in units rather 
than units and dollars; sometimes shipments for a future quarter were 
reported as smaller than those for a past quarter yet anticipated metal 
requirements for the advanced quarter were substantially beyond con¬ 
sumption in the past quarter; shipments reported on schedules of 
subsidiaries sometimes covered the entire company of which it was a 
part, while metal figures related only to the subsidiary; and so on. 
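
One of the inconsistencies just described, falling shipments paired with sharply rising metal requirements, lends itself to a simple edit check. The sketch below is hypothetical: the function name needs_review and the margin of 1.25 used to stand for "substantially beyond" are assumptions, not figures from the WPB editing instructions.

```python
def needs_review(past_shipments, future_shipments,
                 past_metal_use, future_metal_req, margin=1.25):
    """Flag a schedule whose shipments fall while metal needs rise.

    margin is an assumed factor standing in for "substantially beyond"
    past consumption; the source gives no numerical rule.
    """
    shipments_falling = future_shipments < past_shipments
    requirements_rising = future_metal_req > margin * past_metal_use
    return shipments_falling and requirements_rising


# Invented example: shipments drop from $100,000 to $80,000 while
# requested metal rises from 50 tons used to 90 tons required.
flagged = needs_review(100_000, 80_000, 50.0, 90.0)
print(flagged)
```

A flag of this kind only marks a schedule for an editor's attention; the reported figures may still have a legitimate explanation.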

Greater problems were encountered in connection with the metal 
figures. Sometimes manufacturers would report tons where pounds 
were required, and vice versa; frequently forgings and castings were 
reported in units, pipe in feet, and wire in rolls. Clerical errors on the 
part of plant personnel resulted in omission of important requirements 
and placing information on the wrong line or in the wrong column; and 
often odd material sizes and shapes were written in the preprinted stub. 
Other typical and more difficult errors to correct were as follows: 
subcontractors, contrary to instructions, often reported material that they 
did not own but processed for a prime contractor; in some instances a 
manufacturer making basic metal shapes and forms specified on the 
preprinted stub, forgings for example, reported metal consumption 
and requirements for his finished basic product; manufacturers often 
reported one thing on PD-25A and in attached correspondence de¬ 
scribed conditions which necessitated eliminating the figures; and 
many times material reported in Section E did not appear in Section E,
Part 2, when production clearly indicated the material had to be used 
to make the product specified. Obviously these errors had to be corrected,
for tabulation machines added integers only and produced
figures no better than those which went into them. 
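
The kind of review described above can be sketched as a small edit check. This is an illustration only, not the Bureau's procedure: the function name, the unit table, and the assumption that "tons" meant short tons of 2,000 pounds are ours. An entry in a weight unit is converted to whole pounds; an entry in units, feet, or rolls cannot be added to a weight column and is referred back to a reviewer.

```python
# Illustrative sketch: names and the conversion table are assumptions,
# not taken from the PD-25A instructions.
POUNDS_PER_UNIT = {"pounds": 1, "tons": 2000}  # assumes short tons of 2,000 lb


def edit_metal_entry(quantity, unit):
    """Return (pounds, None) if the entry converts to whole pounds,
    otherwise (None, reason) so that a reviewer can correct it."""
    factor = POUNDS_PER_UNIT.get(unit.strip().lower())
    if factor is None:
        # Feet, rolls, and piece counts cannot be summed with weights.
        return None, f"unit '{unit}' is not a weight; refer to reviewer"
    pounds = quantity * factor
    if pounds != int(pounds):
        # The tabulating machines added integers only.
        return None, "fractional pounds; refer to reviewer"
    return int(pounds), None


entries = [(12, "tons"), (3500, "pounds"), (400, "feet"), (18, "rolls")]
results = [edit_metal_entry(q, u) for q, u in entries]
accepted = [pounds for pounds, reason in results if reason is None]
rejected = [reason for pounds, reason in results if reason is not None]
```

Here the first two entries are accepted in pounds and the last two are set aside with a reason, which mirrors the rule that nothing should reach the machines until it is in the unit the column requires.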

In all tabulations it is important to know the exact part of the 
universe which the questionnaire is presumed to cover. Two problems 
in this connection presented themselves in tabulating PD-25A applica¬ 
tions. First, it was necessary under the operation of the Production 
Requirements Plan to consider total requirements for critical materials
and to compare them with total estimated supply. Since the PD-25A 
applications themselves were not used as application forms in certain 
non-PRP or “exempt” areas the WPB had to make arrangements to 
acquire “master PD-25A” schedules covering material use, require¬ 
ments, and inventory for consumers in these areas. This was a WPB 
rather than a Bureau of the Census problem. Second, in the absence of
a long experience with the PD-25A and its predecessor PD-275, and
because of changing war production and industry experience in filing
this type of report, it was exceedingly difficult for the Bureau of the
Census to know when sufficient reports were received to assure reasonably
complete coverage in particular industries or for all industries.
Tight tabulation time schedules necessitated stopping the tabulation 
procedure somewhere so that schedules could be added and presented 
to the WPB for its administration. But tabulations incomplete in 
totality or in important particulars could not be adapted readily to 
administrative needs. 

Obviously one of the first important steps in assuring completeness 
was effective follow-up procedures for stimulating quick industry 
response. But beyond that the Bureau of the Census and the WPB had 
to develop measures of completeness for the entire tabulation and important
product groups thereof. The most important and practical
measure was a list of important plants that had to be included in the
tabulations to assure completeness. The high degree of material concentration
among a relatively few large plants facilitated the determination
of an estimate of the extent of coverage of the total tabulation
merely by checking the important plants included. Within narrow 
product groups this technique, although applicable, was not so practi¬ 
cal. And no matter how carefully incoming schedules were matched 
against lists of important consuming plants, estimates of coverage in 
the aggregate or for particular products were subject to a wide margin 
of error and only the preliminary tabulations could reveal accurately 
the extent of completeness. This problem, due essentially to lack of 
historical series for this type of data, created an unnecessary risk in 
both tabulation and WPB operations. 
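
A minimal sketch of the plant-list check follows. It assumes a benchmark figure for each important plant (for instance consumption in an earlier period); such a benchmark is exactly what the lack of historical series made hard to obtain, so the plant names and weights below are hypothetical and the resulting share is only as good as the benchmark.

```python
def coverage_estimate(key_plants, received):
    """Estimate coverage from a list of important plants.

    key_plants: dict mapping plant identifier -> benchmark weight
                (hypothetical, e.g. prior-period metal consumption).
    received:   set of plant identifiers whose schedules have arrived.
    Returns (share of benchmark covered, sorted list of missing plants).
    """
    total = sum(key_plants.values())
    covered = sum(w for plant, w in key_plants.items() if plant in received)
    missing = sorted(plant for plant in key_plants if plant not in received)
    return covered / total, missing


# Hypothetical figures for illustration only.
key_plants = {"plant_a": 500, "plant_b": 300, "plant_c": 200}
share, missing = coverage_estimate(key_plants, {"plant_a", "plant_c"})
```

With these invented weights, two of the three plants account for seven tenths of the benchmark, and the missing plant is named for follow-up. Within a narrow product group the list is short and a single absent plant moves the share a great deal, which is one way to see why the technique was less practical there.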

Following the above processes, schedules were forwarded to the 
machine tabulation unit where punch cards were prepared. One card 
was punched for each line of Section B, one card for each item in
Section E, Part 1, and one card for each item in Section E, Part 2. In 
addition, there was punched on each card the major product code, the 
secondary product code, the minor product code, the WPB Division to 
which the schedule was assigned, a size code, a completeness code, and 
a serial number for the schedule. Altogether approximately 500,000 
cards were punched for the fourth quarter, 1942, PD-25A tabulations. 
In the next succeeding quarter the volume of cards almost reached 
800,000. Preliminary tabulations were prepared from the cards and sub¬ 
jected to critical review. In this process, the figures were given addi¬ 
tional careful scrutiny to detect errors appearing when the data for 
one homogenous group were listed together. Literally miles of tabulat¬ 
ing machine listings were produced from these cards. 
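
The card layout described above can be pictured as one record per reported line, each carrying the same set of schedule-level codes. The sketch below is an assumption about how such a record might be modeled today; the field names and sample values are ours, and the actual card columns are not given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PunchCard:
    serial: int              # serial number of the schedule
    section: str             # "B", "E1" (Section E, Part 1) or "E2" (Part 2)
    item: str                # line or item label as reported
    major_code: str
    secondary_code: str
    minor_code: str
    division: str            # WPB Division to which the schedule was assigned
    size_code: str
    completeness_code: str


def cards_for_schedule(serial, codes, section_b, section_e1, section_e2):
    """One card per Section B line and per item in Section E, Parts 1 and 2.
    `codes` holds the six schedule-level codes repeated on every card."""
    cards = []
    for section, items in (("B", section_b), ("E1", section_e1), ("E2", section_e2)):
        for item in items:
            cards.append(PunchCard(serial, section, item, **codes))
    return cards


# Hypothetical schedule for illustration only.
codes = {"major_code": "34", "secondary_code": "3", "minor_code": "1",
         "division": "Copper", "size_code": "2", "completeness_code": "1"}
cards = cards_for_schedule(1001, codes,
                           section_b=["line 1", "line 2"],
                           section_e1=["carbon steel"],
                           section_e2=["carbon steel", "copper", "brass"])
```

Repeating the codes on every card is what allowed any one deck to be sorted and listed by product group, Division, or schedule without reference to the others.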

The foregoing account leaves little doubt that tabulating PD-25A
schedules not only was a monumental undertaking but was made in the 
face of serious technical problems. The time schedule for tabulation 
frames the problems of tabulation multum in parvo. For the fourth
quarter 1942 the time schedule was as follows: receipt of schedules, 25
October-10 November; first machine listing, 11 November; machine
listing for the WPB Requirements Committee, 16 November-23
November. Here was what amounted to a census of the metal-working
trades, completed in less than one month!

Despite immense effort to adapt tabulation procedures to fit 
administrative methodology, the PD-25A tabulations, and those of its
predecessor PD-275, raised loud criticism in the WPB. Censure sprang
from many factors, perhaps the most serious of which was the coding 
system used by the Bureau of the Census. The system of coding was 
necessarily rooted in a classification system which reflected existing
industrial organization. The development of the WPB Division structure,
on the other hand, was in terms of product assignments which in
many instances did not reflect the real organization of American in¬ 
dustry. 

These divergent factors created a fundamental conflict between the 
tabulated summaries and the WPB’s ability to use them effectively. 
The net effect was a basic policy decision on the basis of tabulated 
summaries which could not be carried out fully by the WPB Divisions. 
The Census, for example, was obliged to tabulate insulated wire and 
cable into one category partly to obtain a clear-cut homogeneous group 
of data and partly to permit tabulation of individual plant schedules 
reporting production of insulated wire and cable, without reporting de¬ 
tailed data for all types and sizes. But in the WPB various types of 
insulated wire and cable were distributed and managed by different 
divisions with conflicts in the jurisdiction and responsibilities of each. 
The Copper Division was assigned various types of copper cable; the 
Communications Division was concerned with coaxial, lead-covered, 
telephone, and submarine cable; the Shipbuilding Division was in¬ 
terested in armored or degaussing cable; and the Building Materials 
Division processed requests for armored cable, electric cable, and in¬ 
sulated cable. Other products, such as electric motors, valves, and 
condensers were each under the cognizance of many WPB Divisions. 
Some important products such as storage batteries; turbines and water 
wheels; bolts, nuts, washers and rivets; and miscellaneous stamped and 
pressed metal products, were completely unassigned to any WPB 
Division. No juggling of tabulations could adjust to this illogical as¬ 
signment of products within the WPB itself. 

There were two further problems. The WPB Divisions, partly to fit 
tabulations into their organization and partly to fit them into their 
preconceived notions of the type of data summaries needed to provide 
efficient administration, felt that tabulated summaries should be made
in much greater detail than was possible with the 230 product groups 
used in tabulation. A simple illustration of the problem relates to one 
product group established for “plumbing fixture fittings and trim.” Into
this group the Census tabulated products including victory trim which 
contained a small amount of copper alloy; brass trim which was 100 
per cent copper alloy; plumbing specialities which might have been 
either; pipe hangers, supports, and rests, which were largely carbon 
steel; and other fixtures of varying types of metal content. The WPB
problem centered about the policy determination of allocations of this 
group and Division and industry compliance with basic policy. With
such a farrago it was difficult for the Requirements Committee to know 
how much carbon steel included in the allocations for this group would 
be consumed in victory trim as against pipe hangers. Likewise the 
responsible Division had no method for determining, without special 
inquiry, what proportion of the metals allocated to this product code
would be in victory trim (which was available to the public), or brass 
trim (which was available only to the military services). The results 
led to WPB pressures for finer and finer product detail. The Census 
problem in accomplishing finer product detail, of course, centered 
about limitations inherent in manufacturer reports. In the face of
limitations imposed by the reports themselves the effort to produce 
more detailed summaries intensified the entire tabulation problem and 
subjected it more and more to error. In later quarters the product 
groups were refined but never to the complete satisfaction of the 
WPB Divisions. 

A second type of problem pertained to the Census policy of attempting
to make product group data homogeneous. But this policy, particularly
in the first quarter of 1943 tabulation, met head on with increasing
difficulties in the WPB in relating the chain of production to final
products. The tabulating agent was asked, therefore, to code products 
into their end use categories whenever possible. This was done for the
first time in the first quarter of 1943 PD-25A tabulation. Thus, for 
example, brass valves for ships, where identifiable as such, were set up 
in a separate product group under ships; valves for track-laying trac¬ 
tors were tabulated with track-laying tractors; low pressure hydraulic 
valves for aircraft went into aircraft; boiler feed regulator valves for 
locomotives were tabulated in locomotives, and so on. Once again, the 
principal limitation was in the plant reports themselves. Since a com¬ 
plete end-use identification was not possible for all components the 
resulting final summaries were rather far from homogeneous and did 
not completely reflect incorporation of component product data into
product groups established for their respective end items. 

All these difficulties were accentuated because of simple errors in the
schedules themselves. Since the coding experts in the Bureau of the
Census were few in number and since time pressures were so great, 
much of the coding had to be performed by clerks. Although errors were 
for the most part relatively minor, they assumed tremendous propor¬ 
tions in WPB debate. Natural unfamiliarity with a wide variety of 
trade names, many of which were introduced with war production, led 
to miscoding. Some homographic words were miscoded, such as concertina
barbed wire under musical instruments and fire control instruments
under fire-fighting equipment. Such errors were magnified out of
all proportion to their significance by WPB personnel and eventually 
led to rather widespread dissatisfaction with PD-25A and PRP. 

Decentralized Tabulation of Operating Data under the PRP and the CMP 

The PRP experience lent a deeper and stronger support to those 
WPB material distributive systems which followed it than most people 
are willing to admit. The CMP, for example, would not have been 
possible without the experience gained under the PRP. It is in the realm 
of tabulation that one of the supports for this conclusion is clearly re¬ 
vealed. 

The PRP, as a mechanism for material distribution, brought to light 
many important “bugs,” the spotting of which gradually filtered up 
through the various managerial levels of the WPB and on the way 
crystallized into rather clear-cut issues, which could be resolved only 
by top-side policy decisions. The more important questions related to 
the precise material shapes and forms to be controlled by a master 
distributive machine, the exact organization of summary data for
policy decisions and administration, and the methods by which data 
were to be accumulated on both the requirements and authorization 
side of the management chain of action. These issues were resolved in 
the promulgation of the Controlled Materials Plan. 

Conclusions of the planners and administrators of basic policy coin¬ 
cided in the fourth quarter of 1942 concerning the need for and ad¬ 
vantages of decentralized tabulation of authorizations made by the 
WPB Divisions on PD-25A and other material allotment instruments. 
It was recognized that the WPB could no more administer its program 
for metal distribution without some accounting of action than a private 
business enterprise could carry on its day-to-day operations without 
some accounting for its cash withdrawals. In addition, it was recog¬ 
nized that decentralization of the authorization tabulations within each 
WPB Division offered many administrative advantages over central¬ 
ized accounting control tabulations. 

These concepts developed into a number of basic operating tenets. 
The following were of considerable importance: (1) those responsible
for administration should also be made responsible for the figures 
which formed the basis of control action; (2) decentralized control 
records should be available for uses other than control accounting; (3) 
specialized problems relating to particular industries, plants, and 
products could be best ironed out on a local basis; (4) disagreements 
over methods of tabulation and accuracy of results disappeared under
decentrally posted but centrally directed tabulation; and (5) requirements
data and authorization actions could be readily related through
all levels of administration for each detailed segment of industry when 
data were tabulated decentrally. Undoubtedly, centralized tabulation 
had certain advantages over decentralized tabulation but admin¬ 
istratively the latter technique was far superior to the former. 

To test the statistical validity of these tenets both the Bureau of 
the Census and the WPB Divisions tabulated fourth quarter 1942 
PD-25A returns. Substantial and ridiculously careless errors were 
detected in the WPB Division tabulations when the two series of data 
were matched. Nevertheless the experience was well worth the effort 
and firmly established both the utility and feasibility of WPB Division 
decentralized tabulation of data relating to the operation of a compre¬ 
hensive metal distribution program such as the PRP. 
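
The matching of the two series can be sketched as a comparison of totals by product group. The text does not say how the matching was actually done; the function below, its tolerance parameter, and the sample figures are assumptions made for illustration.

```python
def match_tabulations(census, division, tolerance=0):
    """Compare two tabulations given as {product_group: total}.

    Returns a sorted list of (group, census_total, division_total) for every
    group whose totals differ by more than `tolerance`, or which appears in
    only one tabulation (the absent total is reported as None).
    """
    discrepancies = []
    for group in sorted(set(census) | set(division)):
        a, b = census.get(group), division.get(group)
        if a is None or b is None or abs(a - b) > tolerance:
            discrepancies.append((group, a, b))
    return discrepancies


# Hypothetical totals for illustration only.
census_totals = {"valves": 1200, "insulated wire": 800, "electric motors": 450}
division_totals = {"valves": 1200, "insulated wire": 810, "bolts": 75}
found = match_tabulations(census_totals, division_totals)
```

Groups that agree drop out, leaving only the entries that need a clerk's attention: a differing total, a group one side omitted, and a group one side added.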

In the meantime, plans were under way for the introduction of the 
Controlled Materials Plan in the second quarter of 1943. Tabulation 
techniques contemplated in the operation of the CMP fortunately 
were guided by the previous history of the PRP and other tabulation 
experiences. First, a firm decision was made concerning the metal 
shapes and forms to be included on the stub of the CMP application 
forms. Second, the WPB firmly grappled with the problem of product 
codes into which Class “B” products were to be incorporated. And they
were fitted into the WPB's operations. This stilled much of the WPB 
criticism of Bureau of the Census product codes used in the tabulation 
of PD-25A. Third, the problem of proper editing of forms before
tabulation was recognized and acknowledged. Finally, the WPB 
Divisions themselves were to make both the original requirements 
tabulations and the tabulation of actions taken on the basis of the 
WPB Requirements Committee decisions. 

The problem of determining product groupings was divided into two 
parts, reflecting the combination of vertical and horizontal material 
distribution techniques upon which the Plan was based. On the vertical 
side, the Claimant Agencies determined their own groupings in terms 
of final products, or programs. The WPB problem with respect to the 
Class “B” list was comparable with that encountered under the PRP 
but of less formidable dimensions. Here classification had to be in 
terms of fabricated products entering into end products. To grapple 
with this problem the WPB established a Product Assignments Com¬ 
mittee which attempted to reconcile the differences existing among the 
various Divisions, to establish firmly individual Division responsibility 
over an individual product, and to place responsibility for groups of
related products in one Division. Experience under the PRP went a 
long way toward defining properly the individual items to be included 
in a particular Class “B” code and in insuring that items calculated to 
be related were really homogeneous and reportable by industry in the 
terms specified. But despite this experience, there were some difficulties 
with the listing. There was some overlapping of responsibility among 
the Divisions, some product groups were too refined for good industry
reporting and effective WPB processing, some product groups were
too narrow and not representative of a homogeneous group of products, 
some products were completely overlooked, and some product groups 
were far too inclusive. Altogether, however, the difficulties raised in 
the original listing were not of substantial proportions and operations 
over a few quarters rather well ironed out the problems. 

The classification scheme established in February 1943 for Class “B” 
products embraced 484 product groups as compared with 441 used 
under the PRP in the first quarter of 1943. This represented a substantial
expansion because it will be remembered that the Class “B”
listing was confined to presumably shelf items and did not include the 
industrial universe as did the PRP. A great volume of end products 
such as guns, tanks, ships, etc., were considered Class “A” products and
taken care of in the programs established by Claimant Agencies. It 
is interesting to note that some PRP product groups were combined to 
form one CMP Class “B” group. Such instances, however, were few. 

The history of product group classification up to this point was one 
of general expansion, modification, consolidation, and adjustment. 
This process was so widespread that any attempt to determine sum¬ 
mary data covering the war period from the various instruments used 
by the WPB was greatly complicated by variations in product group¬ 
ings, not to mention coverage, type of data obtained, and overlapping 
time periods. Thus, tabulation of combinations of reports was virtually 
impossible. The introduction of the CMP classification systems made 
data obtained from the series of comprehensive reports preceding it 
virtually useless in the operation of the CMP. 

Preceding experience with authorization instruments demonstrated 
beyond doubt that to obtain good and accurate tabulations of requirements
and to furnish the basis for accurate accounting it was necessary
to subject incoming forms to some critical review. During the second 
quarter of 1943, the first quarter of the CMP operations, the Bureau of 
the Census tabulated CMP-4B requirements for the WPB. But in the 
third quarter of 1943, when the WPB Divisions first undertook that 
responsibility, the manual of instructions to the Divisions contained a
careful analysis of the types of adjustment necessary to provide uni¬ 
formity in tabulation and program determinations. Recognition of and 
planning for inaccurate reports spelled the difference between usable 
and unusable WPB summaries. 

Fundamentally, the problem of the WPB Division tabulation of 
controlled material requirements and authorizations against those 
requirements was simple. It was merely an additive function. The basic 
reporting requirements for specific metals and metal shapes and forms
were well established; the problem of specifying end-use on the forms 
had been resolved by means of the extension of program symbols; the 
product classifications into which the data were to be summarized and 
assignments of the product groups to specific WPB Divisions had been 
accomplished with success; and finally, industry had ample opportunity 
to adjust its records to provide the sort of data needed on various 
application instruments. There was little else to do but add. 
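
The additive character of the work can be shown in a few lines. The sketch assumes each edited application has already been reduced to a Division, a product group, a material, and two integer quantities; these field names and the sample rows are ours, not those of the CMP forms.

```python
from collections import defaultdict


def tabulate(applications):
    """Sum requirements and authorizations by (division, product group, material)."""
    requirements = defaultdict(int)
    authorizations = defaultdict(int)
    for app in applications:
        key = (app["division"], app["product_group"], app["material"])
        requirements[key] += app["required"]
        authorizations[key] += app["authorized"]
    return dict(requirements), dict(authorizations)


# Hypothetical edited applications for illustration only.
applications = [
    {"division": "Plumbing", "product_group": "trim", "material": "carbon steel",
     "required": 400, "authorized": 300},
    {"division": "Plumbing", "product_group": "trim", "material": "carbon steel",
     "required": 250, "authorized": 250},
    {"division": "Plumbing", "product_group": "trim", "material": "copper alloy",
     "required": 100, "authorized": 40},
]
required, authorized = tabulate(applications)
```

Because requirements and authorizations are accumulated under the same key, the two totals can be set side by side at any level, which is the relation between requirements data and authorization actions that the decentralized accounts were meant to preserve.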

The WPB Industry Divisions were not prepared to tabulate material
requirements on CMP-4B applications submitted for the second 
quarter of 1943, and the Bureau of the Census was asked to make 
summaries for the WPB. To facilitate operations the WPB received 
the applications, coded and edited them, and transmitted them to the 
Bureau of the Census for aggregation. Because of the preceding ex¬ 
perience with the PRP, however, the WPB Divisions were quite able 
to maintain authorization accounts for the second quarter of 1943, 
and all subsequent quarters. In the third quarter of 1943, both the 
WPB Industry Divisions and the Bureau of the Census tabulated 
incoming applications, the dual tabulation constituting insurance 
against WPB Industry Division failure. As it turned out, a series of 
audits among the WPB Industry Divisions showed that tabulation 
was proceeding accurately and satisfactorily. As a result the Bureau 
of the Census tabulation was not completed. In the fourth quarter of
1943, WPB Industry Divisions assumed all CMP requirements and 
authorization tabulations with which they were concerned, except 
those pertaining to mill reports of shipments. These were tabulated by 
the Bureau of the Census. 

Under the operation of the Controlled Materials Plan during the 
fourth quarter of 1943, which procedure continued throughout the war 
period with but minor modifications, the WPB Industry Divisions 
were responsible for a number of tabulations. The most important 
related to a summarization of material requirements reported on in¬ 
coming applications, and accounts showing authorization made against 
them. The Industry Divisions tabulated data on incoming CMP-4B
applications for which they were responsible and summarized them for
presentation to the General Statistics Staff of the WPB on form GA-298,
shown in the accompanying illustration. These summaries were
then duplicated, aggregated, and presented to the WPB’s Vice-Chairman
for Operations.

[Illustration: Form GA-298, United States of America, War Production Board, “Summary of CMP-4B Applications for Allotment of Controlled Materials for 4th Quarter 1943.” Section A, Value of Shipments Analyzed by Preference Ratings and Claimant Agencies, with columns for shipments April-June 1943, estimated shipments July-Sept. 1943, and estimated shipments Oct.-Dec. 1943.]

Some relatively minor complications in this simple additive process 
grew out of time pressures, the need for including large late respondents, 
and complicated summaries required from the Industry Divisions. 
There was a general tendency to increase the number and complexity 
of the types of information summarized by the WPB Industry Divisions.
Tabulations for the third quarter of 1944, for example, were
presented on forms GA-689 (Requirements) and GA-689 (Explanation). 
The format of these two forms presents eloquent testimony to the 
increasing diversity and complexity of the tabular requirements. But 
as the CMP operations became smoother the need for such complicated 
presentations vanished and in 1945 the tendency towards simpler reports
again prevailed. The third quarter of 1945 CMP-4B requirements
tabulations (made in June 1945), for example, reverted back to form 
GA-289. Although these were the basic summary presentations the 
responsibility still existed for the Divisions to provide a variety of 
other compilations. 

In the operation of the CMP, as in the PRP, time schedules were 
always tight. In the fourth quarter of 1943, completed tabulations of 
almost 33,000 CMP-4B applications were made 55 days following the
final date for filing the applications. Subsequent quarterly operations 
shaved this time cycle considerably. 

WPB Field Tabulations

The real impetus to decentralized tabulations in WPB Field Offices 
originated with the broad decentralization policy of the WPB in 1943. 
Transfer of more and more operating functions to field offices at that 
time would have left a gap, had it not been filled, in the flow of significant
information to officials responsible for the establishment of these
policies. With expansion of field office actions, therefore, it became 
essential that they be charged with the responsibility for maintaining
uniform reports on actions, consistent with and related to comparable
records maintained in the WPB Industry Divisions.

Experience with decentralized tabulation, both in the WPB and its
field offices, was satisfactory enough to set it apart as being feasible 
and practicable. Indeed, it may be concluded that sound wartime 
administration is better achieved when tabulation of data closely
related to administration is made decentrally by those who are responsible
for the use of the data in management. This conclusion has
great importance for mobilization plans of the future which must 
utilize decentralized administrative methodology. 

PART VI 

LESSONS OF EXPERIENCE 

The task of the War Production Board and its predecessor agencies 
can be stated simply. It was to manage the industrial economy of the
United States in such a way as to produce the maximum war and 
war-supporting goods and services with the maximum dispatch.

This was predominantly an administrative job. To an extent un¬ 
precedented in our history the national economy was guided by govern¬ 
ment between 1942 and 1945. During the war the construction of every 
new industrial plant was authorized by government. Not a pound of 
steel or copper or aluminum could legally be fabricated and used with¬ 
out government approval. In all industrial production and distribution 
the decisions concerning what should be produced, who should produce 
it, and to whom it could be shipped were directed by government. 

Facts and proficient factual collection methods were indispensable 
to the intelligent and efficient performance of governmental management.
Actions taken in industry were recorded on millions, yes billions,
of pieces of paper. It became necessary for the WPB to aggregate information
from these documents to provide the means for making
informed policy decisions, to provide the instruments for administering 
the thousands of actions intertwined in a single policy decision, and to 
develop records of the actions taken on the basis of policy decisions. 

Looking back on the beginnings in 1940 and 1941, and even the later
developments of 1942 and 1943, two facts stand out in connection with 
the emergency data-collection system. First, many of those responsible
for industrial mobilization and continuous control of the economy did 
not have a clear concept of the vital place which systematized factual 
reporting occupied in the emergency management function of govern¬ 
ment. Second, methodological discipline required to systematize the 
collection and use of facts needed for emergency management was not 
developed and ready for use. The result was the parallel evolution of a 
recognition of the need for factual accumulation to support manage¬ 
ment and the development of suitable techniques to get the necessary 
facts. The process of meshing the two was slow and costly. 

Two classes of lessons emerge for those who in the future must deal
with the problems of emergency organization and direction of the
nation’s economy. The first group relates to the role played by an in¬ 
dustrial statistical reporting system in emergency management. The 
second group relates to the technical methodology, and the principles 
which should be used in its application, in developing and presenting 
the facts needed for formulating emergency policy and administering
that policy. 

Before presenting these lessons it is worthwhile to draw attention to 
the applicability of World War II reporting experience to emergency
problems of the future. It is our contention that no matter what the 
circumstances may be in a future emergency, wise management is im¬ 
possible without facts; that it is possible to develop a system for col¬ 
lecting and using facts which is adaptable to government management 
of industry in all potential emergencies; and that the experience of 
World War II establishes the outlines and basic character of that 
system. 

The Role of an Industrial Reporting System in Emergency Management 

In the desperate search for means with which to supply the materials 
of war, we learned that effective decisions could be made only on a 
basis of knowledge. In the development of early control policies and 
techniques of administration it was assumed that statements of policy, 
supplemented by a few administrative reports, would provide the 
framework upon which industry could and would automatically mobi¬ 
lize for war production. It soon became apparent that the pull of the 
regular customer always diverted a substantial part of industrial out¬ 
put from war production unless statements of control policy were 
based upon understandable, complete, fool-proof facts and administrative
techniques. Mere statements of policy, we learned, would not
provide the guns, planes and tanks when manufacturers who supplied 
the goods of war were at the same time engaged in making automobiles, 
refrigerators, cosmetics and other items which offered them profitable 
use of materials, facilities and labor. We also learned that control over 
industry, once it converted to war production, could not be achieved 
by broad policy statements unsupported by concrete reporting to and 
from government. No firm control was possible without a method for 
collecting and using facts as instruments of both policy and administration.

The experience of the WPB clearly reveals the character of the industrial
factual collection system which emergency management requires.
Stated simply, it is necessary to have a complete and detailed
knowledge of the nation’s resources and the demands which war will 
make upon them. This knowledge must be available on a comprehen¬ 
sive scale, reveal the most minute details, be in common terms by means 
of which the entire range of facts can be related to one another and to 
aggregates, and cover specific time frequencies. It means developing a 
national resources-requirements budget. But the factual collection 
system must accomplish more than that. Not only must it furnish the 
information essential for policy decisions but it must also provide the 
means through which mandates for specific action are transmitted to 
industry in specific terms. At every stage of responsibility this flow of 
authority must provide a basis for precise and definite accountability 
and reports of administrative action. These are the fundamentals of 
an adequate industrial reporting system suitable for, and flexible 
enough to support, emergency management. 

Since the underlying problems of industrial mobilization are largely 
those of translating available men, materials, and machines into the 
greatest possible quantity of goods required for military operations 
and war-supporting activities, there must be a complete and penetrat¬ 
ing knowledge of all resources. Information about resources requires a 
reporting system which will provide in integrated detail the actual and 
potential production of mines, forests, farms, smelters, refineries, fac¬ 
tories, power plants, railroads, public utilities, and other productive 
facilities. The reporting system must provide knowledge of the smallest 
as well as the largest economic unit. The system should cover facilities 
for producing various types of materials, the actual level of operations, 
unfilled orders, potential capacity, labor requirements, material and 
component requirements, and so on. The entire fabric of industrial 
resources and their use must be revealed clearly in totality and in 
detail. The details must tie together in a factual picture, given in 
common units of measure and in definite time periods. 

Against this knowledge of resources must be placed a statement of 
the demand for them in such terms that the two may be related. The 
demand for resources must be built upon a detailed, realistic and 
balanced statement of military, essential civilian war-supporting, and 
export program requirements. This statement must be in terms of 
specific items needed directly by the armed forces and civilians; the 
materials, components and facilities required for their production; and 
the time periods in which both final products and the intermediate 
products needed for their production are scheduled. In short, total 
requirements must be known and classified into procurement programs
and production details. This factual statement must be realistic,
accurate, and in common units of measure which can be related to
resources.

To determine the extent to which certain types of war activities 
could be supported, the WPB found that it required knowledge of the 
demands of a specific activity under consideration and information on 
all of the related supporting and dependent activities. Thus, in ap¬ 
praising the size of an ordnance program for Oerlikon guns, it was 
necessary to determine not only the quantities of nickel-bearing steel 
that would be used in the shafts of the guns but also the quantities of 
nickel-bearing steel that were required for the bearings of the gun, the 
shells that would be fired from the guns, the axles of the railroad cars 
that would haul the finished guns and the material used in their pro¬ 
duction. To appraise the one requirement for nickel-bearing steel all 
other requirements had to be evaluated, from the nickel-bearing steel 
used in the finished guns to the nickel-bearing steel required to make 
replacement bearings for a machine which mixed malted milks at a 
canteen in a war plant. 

This sort of informational need grew out of the problems and circum¬ 
stances of war. When the economic machine was straining at optimum 
capacity, when shortages appeared in many areas, wise direction of 
the machine was possible only when government on the basis of facts 
could choose among several demands focused at one point in the 
production process. If the supply of nickel-bearing steel was less than 
the total demand for it how else could an intelligent decision be made 
to double the quantity of steel entering into Oerlikon gun parts and 
reduce the amount of steel used as replacement parts for malted milk 
machines? Effective decisions required a factual review of total supply 
of and all demands for the steel. 
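
The kind of factual review described above can be sketched in a few lines of modern code. The sketch below is only an illustration: the tonnages, the priority ratings, and the rule of filling demands strictly in priority order are all assumptions made for the example, not figures or procedures taken from the WPB.

```python
from dataclasses import dataclass


@dataclass
class Demand:
    """One claim on a scarce material (all names and figures are hypothetical)."""
    use: str
    tons: float
    priority: int  # 1 = most urgent (assumed rating scheme)


def allocate(supply_tons: float, demands: list[Demand]) -> dict[str, float]:
    """Fill demands in priority order until the supply is exhausted."""
    remaining = supply_tons
    allotment: dict[str, float] = {}
    for d in sorted(demands, key=lambda d: d.priority):
        granted = min(d.tons, remaining)
        allotment[d.use] = granted
        remaining -= granted
    return allotment


# Hypothetical demands for nickel-bearing steel in one period.
demands = [
    Demand("gun parts", 600.0, 1),
    Demand("shells", 300.0, 1),
    Demand("railroad car axles", 200.0, 2),
    Demand("malted milk machine bearings", 50.0, 3),
]
supply = 1000.0

total_demand = sum(d.tons for d in demands)
shortfall = max(0.0, total_demand - supply)
allotment = allocate(supply, demands)
```

The point of the sketch is that no single demand can be judged until every demand and the total supply are stated in the same unit and for the same period; only then can the shortfall be computed and a choice among claimants made.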

It should not be assumed that once the size of the expanded war 
program was projected and the economic system was approaching 
capacity operations, the need for this interrelated information became 
readily apparent. On the contrary, realization of the need for facts and 
the magnitude of the problem of getting such facts was gained only 
after long and costly experience. 

In the early defense period we tried to maintain all peacetime activities
and at the same time satisfy the demands of war. Only when we
realized how perilous was our existence and how great was our need 
for weapons did we have the courage to stop the production of a few 
unessential items. Later, in mid-1942, we belatedly recognized the fact 
that actual war output was too small and that we could not in fact 
have both guns and butter. Attitudes were rapidly switched by the
winter of 1942 and the military services were demanding programs 
which, if attempted, would not only have precluded butter but would 
have so strained the economy as to almost stop production of guns too. 

In all of these actions we were driven continuously between desire 
and reality. The cord that joined the two was at all times knowledge. 
But the precise knowledge required did not exist when most needed. 
It was unfortunate that in the beginning we could no more foresee our
needs for information than we could foresee our needs for armament. 
We tried to satisfy everyone in the early defense period by a free
apportionment of materials and facilities and by the avoidance of a
large-scale reporting burden on industry.

Ultimately we learned that we had to have facts and developed 
the outlines of the industrial reporting system required for emergencies.
We learned that the system should produce a complete budget
of all vital resources and all demands to be made upon them. The 
data must be complete and precise, from the supplies of basic resources 
to the procurement programs for final military, export, and essential- 
civilian products. With the data at hand it must be possible to adjust 
and balance programs among themselves and in the aggregate and to 
relate them to resources in such a way that objectives established are 
feasible of attainment. The data must show the time sequences in 
which various quantities of limited raw materials are needed at all 
levels of fabrication from basic material processes to final products. 
Throughout, the data must tie horizontally and vertically, must be 
expressed in uniform terms and must be related to uniform periods of 
time. Beyond all this, the methodology of collecting these data must 
be such that accurate results can be acquired in a matter of days. The 
more serious the emergency the more urgent becomes the demand for
this sort of integrated reporting. 

The more integrated and comprehensive the reporting structure the
wiser will be basic policy and the more effective will be the
administration of that policy. With an integrated reporting system it is
possible to determine whether specific policy objectives or slightly modified
ones can be achieved. It is possible to translate policy decisions into 
actions to be taken by large numbers of people at different admin¬ 
istrative echelons. It is possible to convey in concrete terms to every 
person required to act both the specific authority and responsibility 
which each has and the precise limits imposed on the exercise of au¬ 
thority. An integrated statistical system should provide the methods 
and procedures by means of which information essential for policy 
decisions can be obtained, by which mandates for specific actions can
be transmitted through administrative levels, and by which precise 
accountability for the administration of each policy can be achieved. 

It must be realized that an integrated industrial reporting system is 
not the sole basis for efficient mobilization and industrial control by 
government in emergencies. Time and again the WPB found that its 
internal organization had to be geared to the methods of the reporting 
system if the data upon which the WPB depended were to be used to 
the greatest advantage. Time and again the WPB had to adjust basic 
control procedures to fit the realities of sound data-collection methods. 
This does not mean that statistical methodology dictated either or¬ 
ganization or control policy. Rather it means, as the WPB discovered, 
that since all three had to be related as closely as possible to the 
fundamental methods and procedures of industrial operations, all 
three had to be coordinated for the most effective operation of each. 

Finally, as the WPB also discovered, the best statistical system will 
fail to reach its potential if there is not a free flow of information 
among all levels of management and if management at any important 
level fails to admit the importance of accurate factual knowledge in 
making decisions and administering them. An emergency statistical 
reporting system, even though it has the characteristics noted here as 
being important, will be no better than the degree of coordination of 
other elements of management and the intelligence with which it is 
operated and with which its data are used. 

Applying the Technical Methodology of a Reporting System 

In striving to produce the statistical discipline which it came to 
recognize as the basis of its operations, the WPB met with every im¬ 
portant data-collection problem imaginable. In the solution of these 
problems it experimented with almost every known technique. Se¬ 
lected technical lessons which the war period taught are the substance 
of the preceding articles. Since the discussion is a concentrated con¬ 
densation of a vast experience, it is not desirable to attempt further 
summarization at this point. A better purpose can be served by 
attempting to set forth a few of the more important principles which 
experience has taught should be kept in mind when applying the highly 
technical methodology of industrial factual accumulation in emer¬ 
gencies. 

Experience shows that it is both necessary and possible to create a 
coordinated emergency industrial reporting discipline. An emergency 
industrial control agency is in a real sense an operating holding company
over the productive machine of the nation. Such an agency
must be in a position to collect accurately and quickly at one central 
point or points summaries and details of the millions of actions which 
are taken in the highly complex production machine. The WPB's 
experience shows that it is possible to draw the factual threads of in¬ 
dustrial action together at a central point or points in such a manner 
that the data are well coordinated, comprehensive in character yet
penetrating in detail, accurate, and quickly tabulable for prompt use.
If methodology is well planned, carefully applied and used with in¬ 
telligence it can be made flexible enough to provide data to meet all 
important problems. The WPB never created such a statistical dis¬ 
cipline but its experience leaves no doubt about both the need for it 
and the ability of government and industry to create it. 

The development of a coordinated, flexible and useful reporting sys¬ 
tem is impossible in an emergency without some centralized control 
of the issuance of questionnaires and the application of technical 
methodology. In time of crises every pressure on administrators is
toward immediate action. The drive for facts is so great that any 
promising method is attempted. The results of unbridled issuance of 
questionnaires in an emergency cannot be less than an unjustified
burden on industry and chaotic government administration. By cen¬ 
tralizing control over questionnaire issuance a focal point is provided 
for gathering together all sorts of questions relating to questionnaire 
policy and methodology. An opportunity is provided for such ques¬ 
tions to be resolved in the light of the best policy and methodology. 
If this function is managed with wisdom the questionnaire system and 
its techniques may develop into and operate as a balanced fact-gath¬ 
ering mechanism. If it is not, the best plans and intentions will not 
create a coordinated data-collection system. 

To approach a coordinated statistical methodology the many technical
principles composing it and inherent in its satisfactory maintenance
and operation must be well-formulated and clearly set forth.
A balanced reporting system represents a highly complex structure in 
which details are of great importance. Not only may injudicious in¬ 
structions, or poorly-phrased questions leave management without 
facts for crucial decision, but inattention to minute household details 
relating to such matters as mailing lists, follow-up procedure, editing, 
coding, and machine tabulation may also invalidate an important 
survey. And, paradoxically, the more complicated in substance and 
comprehensive in coverage the questionnaire the more important 
become these details to valid results. Questionnaires of small magnitude 
may be hand-processed to avoid administrative problems in government.
Comprehensive questionnaires cannot be so pampered. Slight
error or miscalculation in any technical detail involved in their life 
cycle may become so magnified that the final results may be either 
suspected or ignored by management. The existence of a set of care¬ 
fully considered technical standards which may be uniformly applied 
if the will and machinery exist to do it is a primary requisite to sta¬ 
tistical discipline. 

One of the most important groups of statistical standards which 
must be created and applied with the greatest skill is that relating to 
technical nomenclature and statistical units of measure. Not only are 
definitions and classifications of products, materials, plants, etc., cru¬ 
cial in the good use of data but in their determination there are present 
many problems which are administratively complex. Units of measure 
are, of course, basic in the functioning of a reporting system. Their 
determination and application also present exceedingly difficult and
complicated problems. Statistics are a method of communication. They 
are, in addition, instruments of policy and administration. They are
useless to serve these basic functions unless they are founded on a clear, 
understandable and administratively usable language of definition, 
classification and units of measure. 

The WPB found that it is not important nor always desirable to 
embrace the complete universe of a subject matter in order to produce 
meaningful data. Concentration in industry of metal consumption, 
component usage, manpower requirements, and other production elements,
is such that facts necessary to exercise control can be acquired
from much less than the totality of units in a given area. For almost
every product and material not more than 15 per cent of the total pos¬ 
sible plant respondents account for not less than 85 per cent of the 
total economic activity for that product or material. By limiting 
reporting to these large units, and by exercising control in smaller 
units by means of general regulations, the objectives of emergency 
industrial control can be attained without imposing serious record¬ 
keeping burdens on the bulk of the small enterprises and serious clerical 
problems on government. 
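
The selection of reporting units described above amounts to what would now be called a cutoff sample. A minimal sketch follows, assuming a hypothetical list of twenty plants with invented activity figures; the 85 per cent target is the only number taken from the text.

```python
def cutoff_sample(activity_by_plant: dict[str, float], coverage: float = 0.85) -> list[str]:
    """Return the largest plants whose combined activity reaches `coverage` of the total."""
    total = sum(activity_by_plant.values())
    selected: list[str] = []
    covered = 0.0
    # Take plants from largest to smallest until the target share is covered.
    ranked = sorted(activity_by_plant.items(), key=lambda kv: kv[1], reverse=True)
    for plant, amount in ranked:
        if covered >= coverage * total:
            break
        selected.append(plant)
        covered += amount
    return selected


# Hypothetical industry: three large plants and seventeen small ones.
plants = {"A": 400.0, "B": 300.0, "C": 200.0}
plants.update({f"small_{i}": 5.0 for i in range(17)})

reporters = cutoff_sample(plants)
share_of_plants = len(reporters) / len(plants)
share_of_activity = sum(plants[p] for p in reporters) / sum(plants.values())
```

In this invented case three of the twenty plants would file reports, and the remaining seventeen would be governed by general regulation rather than by individual reporting.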

Reporting methodology must be geared to industrial practices if
prompt and accurate data are to be acquired. The relationship between 
the design of centralized control procedures and methods of industrial 
operation must be carefully considered when developing administra¬ 
tive techniques. Experience teaches that any control procedure which 
accommodates itself to industrial practices can be instituted with a 
minimum of operational friction. Every new control system must go
through a warm-up period to familiarize industry with the new routines
and to permit necessary adjustments to them. To the extent that
operational changes are held to a minimum, this initiation period is
shortened. In addition, any control system which is based upon industry’s
existing record-keeping practices is bound to yield better and more
complete results. The closer the reporting framework and the controls 
which it supports approximate industrial practices, the quicker and 
easier it is to attain the objectives established for controls. 

Better results are obtained if reporting is on a quid pro quo basis.
Despite the obvious load which some complicated questionnaires, such 
as allocation and scheduling forms, imposed on industry, particularly 
on those plants whose record-keeping systems were not designed to
yield readily the required data, reports were received quicker and with
more accuracy than most experienced statisticians thought possible. 
On the other hand, many comparatively simple questionnaires de¬ 
signed to acquire general information for broad policy analysis were 
often met in industry with intense and bitter opposition. Response was 
slow and replies incomplete. The basic reason for this paradox is sim¬ 
ple: the former eventually resulted in a right for the respondent to do 
something; the latter produced no direct advantages to the respondent. 
The obvious lesson to be drawn from this experience is that the more 
closely statistical data requirements are tied to application-type reporting
forms the better the data will be both with respect to accuracy
and completeness and the quicker returns will be made by respondents. 
In addition, industrial criticism and objections to a questionnaire, 
other things being equal, tend to decrease with the importance of the 
privileges or rights bestowed through the reporting instrument. 

A comparatively long period of time is required to produce accurate 
and complete data through a new form. On the basis of the WPB’s 
experience, from three to six months are required from the first submission
of a new complicated reporting instrument to the time when
both industry and government function reasonably smoothly on the 
basis of it. With advance publicity this period can be reduced, pro¬ 
vided the questionnaire is not too complicated nor has in it injudicious 
procedures or questions difficult or impossible for industry to answer. 

It is possible to reduce the number of data requests if a case analysis 
is made of the internal use of the data derived from the questionnaire 
and adequate provision is made within the control organization for the
free and prompt exchange of information. The basic categories of
information required for emergency management are not large. If sound
technical standards relating to such matters as units of measure,
nomenclature, time periods and other details are carefully applied, and if
a well-planned statistical system is used to support intelligent and suit¬ 
able control policy, there is no reason why a relatively few reports 
cannot be made to satisfy fully management’s requirements for facts. 
The experience of the WPB adequately supports this conclusion. 

The adaptation of accounting techniques to statistical methods can 
be made effectively both to support implementation of policy decisions 
and to provide the necessary factual data for review of past decisions 
and new determinations. A statistical-accounting technique can exist 
side by side with a statistical reporting system and can constitute an
important part of the entire data-collection framework. Although the
WPB’s material accounting was not as accurate as financial accounting,
it performed for production control much the same functions as finan¬ 
cial accounting performs for industrial financial control. By introducing 
tolerances permitted by material accounting needs, accounting control 
was instituted and operated at a far less expenditure of time and en¬ 
ergy than financial accounting. Statistical accounting is a practical and 
necessary part of any emergency data-collection structure used to 
support control policy and its administration. 

An adequate reporting system cannot be devised overnight. In early 
1940 a tremendous mass of information was available in Washington 
pertaining to the operation of the industrial system. The data, however,
could not be coordinated nor organized to present a comprehensive
statement of resources. Data on potential war demands were
for all practical purposes non-existent. The concept of an integrated 
emergency reporting system was unborn. Developing the concept and 
nursing it to maturity was a difficult and time-consuming task.

Experience of the WPB shows that it never fully recovered from 
these early statistical deficiencies. Makeshift schemes and organiza¬ 
tions became deep rooted. It was impossible to erase them completely. 
As the war progressed the value of satisfactory new control systems
was jeopardized because of existing inept procedures originated at a
time when a scramble for factual knowledge resulted in the introduction 
of any procedure which appeared capable of producing some results. 
The erection of an effective statistical reporting system was, therefore, 
doubly handicapped. 

Our experience after 1938 now demonstrates that we were as ill-prepared
to create and use the proper techniques for collecting information
as we were to wage modern war and produce quickly the weapons
needed for its prosecution. Although we had a larger peacetime statistical
organization than any other country in the world, we were not
prepared to provide the facts for an all-out economic effort. The re¬ 
porting system merely provided a foundation. It was comparable in 
this respect to the world's largest steel capacity and the world's largest 
industrial capacity. We had the resources; we did not have the end- 
products that we needed. 

War, we found, required not merely general facts; it demanded that 
facts be obtained on a broader basis, on a wider variety of subject 
matter, and in a shorter period of time than had ever before been 
necessary. To do this job we built upon prewar experience and tech¬ 
niques. Frequently, the needs of the emergency required substantial 
changes in available methodology. In some cases, emergency develop¬ 
ments moved so far beyond original techniques as to almost preclude 
recognition of the foundation upon which they had been built. 

For years we had collected biennially a Census of Manufactures
which covered every fabricating establishment in this country. Data 
were collected for each establishment showing its employment, the 
products which it made, its consumption of electrical energy and the 
dollar value which it added by manufacture. The census was taken 
in odd-numbered years and became available to the general public 
some two years after the end of the period which it covered. In addition 
to this periodic census, we also had a large variety of current reporting 
series in the industrial field. Some of these were developed by the 
Federal government. To a larger extent they were the product of trade 
associations and business groups which cooperated in the collection of 
certain data necessary to their current operations. Over the years the 
Census data had provided a series of benchmarks against which current 
reports could be measured and projected. These were valuable sta¬ 
tistics. For the purpose for which they were designed and used, they 
were undoubtedly all that was necessary. 

The methodology and techniques were also adequate for the purposes 
for which peacetime industrial data were collected. In peacetime, data 
were collected to permit government and business administrators to 
make broad general judgments. Seldom were they used to administer 
specific actions. Business and government operated on the basis of a 
multitude of individual choices not directly related to and administered 
in conformance with national policy objectives. In wartime, the direc¬ 
tion of the economy was founded on a single policy objective made by 
government. And, what is perhaps more important, the implementation 
of that specific central judgment depended on directions issued by the 
central government. Under such circumstances statistics became the 
basis of action. The data therefore had to be accurate and up-to-the-minute.
As a result, we could no longer operate with the methods of
peacetime. A methodology had to be found by means of which pene¬ 
trating industrial data could be collected in the most accurate and com¬ 
prehensive terms and made available to administrators very quickly. 

Our wartime experience demonstrated that no matter how urgent the 
need or how willing the cooperation, an integrated reporting system 
suitable for emergency management cannot be improvised or built in 
a short period of time. The development of useful information is a 
product of knowledge, skill, experience, and time. 

If we are to profit from this experience and avoid costly errors in a 
future emergency, we must now determine upon a course of action 
which will permit us to preserve the accumulated knowledge of the past 
and to use it in molding a better system for future periods of crisis. 
We must develop not only the blueprint of the reporting system 
needed in an emergency, and relate it closely to emergency control 
plans, but we must insure that the major features of the statistical 
system are incorporated as far as possible in the current statistical reporting
structure of government. A relatively small expenditure now
to provide for emergency statistical preparedness, and if need be con¬ 
tinued for the next 40 or 60 years, would be minute compared with the 
costs incurred in the three years of confusion attending the recent war 
effort, or the early period of confusion which we can be sure will be 
part of any future emergency program in the absence of a continuing 
pre-emergency effort to prepare an integrated emergency industrial 
reporting system. 



BOOK REVIEWS 

Edited by 

Oscar Krisen Buros
Rutgers University 

Practical Business Statistics, Second Edition. Frederick E. Croxton (Professor of
Statistics, Columbia University, New York, N. Y.) and Dudley J. Cowden 
(Professor of Economic Statistics, University of North Carolina, Chapel Hill, 
N. C.). New York 11: Prentice-Hall, Inc. (70 Fifth Ave.), 1948. Pp. xix, 660. 
Text edition, $6.35; trade edition, $4.76. 

Review by Alfred Cahen

Textile Economist, Business Information Division, Dun and
Bradstreet Inc., 3^6 Broadway, New York 8, N. Y.

This up-to-date revision of the popular original 1934 edition offers in
understandable form some of the newer techniques on sampling, quality 
control, and time series analysis. Particularly effective are the step-by-step 
clear-cut solutions of numerous problems selected from business and govern¬ 
ment statistics. It remains one of the best texts on business statistics. 

Some specific contributions notably well presented include Chapter 3, 
“Statistical Tables,” and Chapter 4, “Rates, Ratios, and Percentages.” This 
latter chapter effectively points out the inter-dependence of statistics and 
accounting. Chapters 5, 6, and 7 deal thoroughly with graphic analysis. 

Time series analysis is effectively presented in Chapters 11, 12, and 13. 
Some simplified techniques are particularly useful as preliminary adjustment 
of data for price changes, calendar variation, and population changes. 

Chapter 14 on correlation is perhaps oversimplified. The illustration on 
relationship between hardness and tensile strength of 27 pieces of wrought
aluminum alloy results in a coefficient of +.901. The authors indicate in the
preface that Chapters 1 to 15 may be used for a one-semester course and
Chapters 16 to 22 for the second semester. Multiple correlation is dealt with 
six chapters later in Chapter 20. A realistic question arises whether presenta¬ 
tion of simple correlation alone (separate from multiple) may not overimpress 
the student. “A little knowledge is a dangerous thing.” 

Chapter 16 deals with the normal curve. Happily the authors refer only 
once to coin tossing and not at all to dice or cards. These three illustrations 
are generally a curse upon both teacher and students of business statistics 
as they depart so far from reality. One example used to illustrate the normal 
curve is life experience of wooden telephone poles. Few business series can be 
fitted by a normal curve. It was originated by astronomers and generally 
applied to physical and biological data. The family of curves of the
Pearsonian or Gram-Charlier series cannot be studied in a first year course.
Therefore a chapter on the normal curve alone probably contributes little 
that a business statistics student will apply later in practice. However, the 
normal curve is important in the study of sampling distributions. 

Some current indexes are analyzed in Chapter 21, consisting of the 
principal governmental price and production indexes. This chapter is simply 
and interestingly written. Perhaps it could best follow immediately Chapter 
16, “The Construction of Index Numbers.” The allocation of only one-half
page to the logistic and Gompertz curves hardly seems adequate. 

The final chapter, “Budgeting and Forecasting,” is effectively presented
except that no mention is made of unfilled orders and inventories, both of
which are rather frequently used in forecasting. 

The book contains 14 appendices, including mathematical tables such as values
of t, F, etc.; useful reference tables on sums of the first four powers of the
first 50 natural numbers; squares, square roots, and reciprocals; logarithms; 
and interesting side lights such as rounding numbers and flexible calendar of 
working days. 

So much for the high spots of the book. With so many worthy contributions,
one may ask what are the limitations? That concerns primarily the use
of the word “practical” in Practical Business Statistics.

Practical is defined by Webster’s International Dictionary as “of, pertaining
to, or consisting or manifested in, practice or action.”

The emphasis in this text is primarily on techniques as indicated by the 
large amount of space devoted to the following topics: 


SUBJECT                                  CHAPTERS   PAGES

Time series analysis                         3        86
Correlation                                  2        54
Reliability and tests of significance        2        49


In contrast the authors are parsimonious on problems which their students 
will most likely encounter later in jobs in practical business statistics. Some 
of the larger requirements of the operating statistician are in the following 
fields (Note the number of pages given by the authors.): 


SUBJECT                                                          PAGES

Sources of data                                                     8
Detection of errors                                                 5
Questionnaires                                                      6
Sample design (not including chapters on tests of significance)     7
Collection of data by interview or mail                             2
Editing                                                             1
Tabulation methods                                                  6
Graphic presentation (adequately covered)                          62
Report writing                                                      -


These are realistic problems which both young and old practicing statis¬ 
ticians grapple with day by day. 

Sources of data are the first step in tackling a business problem. The
authors do provide in Appendix A a five page list of general sources on pro¬ 
duction, price, wage and similar data. A concrete problem showing a student 
how to find the pertinent information on one industry such as wood furniture 
might be more constructive. 

Detection of errors is probably the most important single problem which 
the practicing statistician encounters every day of his working life. The 
various checks and counter checks to help detect errors are scarcely men¬ 
tioned in this text. 

The selection of the proper tabulation method (hand, machine, or
semimechanical) is highly important to the practicing statistician both in time
and cost. Proper tabulation method sometimes saves several thousands of 
dollars on a single project. The six pages devoted to this subject by the 
authors oversimplify the problem by stating mechanical tabulation is de¬ 
sirable when a large number of schedules is being tabulated with many 
entries for numerous tables. This resembles the sales talks of the representa¬ 
tives of the International Business Machines Corporation and the Reming¬ 
ton Rand Company. The determining factor in the use of mechanical tabula¬ 
tion is not the size of the project. When information needs to be both multiple 
sorted and repeatedly readded, then mechanical tabulation offers operating 
economics. 

However, if it is largely a problem of multiple sorting without summations,
the Keysort or EZ Sort systems can readily provide less costly
operation than mechanical tabulation for the same results. The reviewer saw an
operation of Keysort at the Mutual Life Insurance covering 1,250,000
accounts, certainly a large-scale operation. The authors do not mention the
EZ Sort system, which has a patented advantage of five holes to the inch
compared with four holes for the older Keysort system.

In tabulations, regardless of size, where summation is the primary problem
with little or no sorting, the peg board system and the accounting
machines of the National Cash Register Company both offer economies
compared with mechanical tabulation.

Appendix B describes several calculating and adding machines. Crelle's
tables offer a very economical means of multiplication, but are not
mentioned. Although three different calculating machines and one adding
machine are pictured, no reference is made to the comptometer. Various tests
have shown that adding a set of figures twice on the comptometer is both
speedier and more accurate than adding once on the adding machine and then
reading back the tape. Tape reading involves the human frailty of an
individual's mind wandering instead of listening to figures. Hence errors are not
detected.

The authors have contributed much technical perfection and pedagogical 
skill in the revision of this volume. Some additional time could have been 
profitably spent to find out how their former students are actually putting 
statistics to use in practical business. 

Dr. Juan Kimmelman rather aptly phrased the problem in the June 1944
Journal of the American Statistical Association: “In many cases, after having
devoted several years to practical work, even in high positions, he will admit
that 80 per cent of the knowledge which he obtained in the university
represents dead weight for him, and that it is neither needed nor of any possible
use.”

The examples in this text are selected from actual operations. However, 
the sources of these illustrations are largely from governments, institutions, 
and quasi-public giant corporations, the stratosphere of American business. 
They are technical rather than typical problems encountered from day to day 
by the rank and file of business statisticians. 

All of the foregoing comments can probably be boiled down to a matter of 
relative emphasis. Formula problems are an essential part of a first-year 
statistics course. However, problems of collection of data, detection of errors, 
tabulation, and report writing are never given sufficient space and emphasis
in a first-year statistics course. 

The reviewer recommends this volume as a handy guide that every statistician
will like to have on his desk for reference. Altogether it is an excellent
text on business statistics—practical or otherwise. 


Mathematical Methods for Population Genetics. Gunnar Dahlberg (Head of the
State Institute of Human Genetics, Uppsala, Sweden). New York 3:
Interscience Publishers, Inc. (215 Fourth Ave.), 1948. Pp. vii, 182. $4.50.

Review by Howard Levene
Instructor in Mathematical Statistics and Biometrics
Columbia University, New York, N. Y.

Our present knowledge of genetic phenomena in human populations is still
in a comparatively rudimentary state, and any advances are likely to
prove of great benefit to the human race. The problems involved are: (a)
to learn by investigation what sort of assumptions may reasonably be made
about the processes which occur; (b) to formulate precise mathematical
models embodying these assumptions and deduce their consequences; (c) to
estimate the parameters involved. The Scandinavian countries, with
relatively homogeneous populations, excellent vital statistics, compulsory
reporting of hereditary defects, and governmental interest in the problem, offer a
particularly good opportunity for investigation of problems (a) and (c). The
author, who is Head of the State Institute of Human Genetics in Sweden,
has played an important part in investigating all three problems. He has
written a work with many excellences, combined with some defects.

This book is a translation, with minor changes, of the author’s
“Mathematische Erblichkeitsanalyse von Populationen,” Acta Medica Scandinavica,
Suppl. 148, 1943. The author has deliberately omitted all discussion of the
work by the Fisher school and others on estimation of mode of inheritance,
linkage relationships and gene frequencies for particular characters; and has
made little use of the investigations of Wright, Fisher, Haldane and others
on evolution in natural populations. Within the author’s chosen field, the
book suffers from the defects to be expected of a book written on an
essentially mathematical subject by a non-mathematician. Most of the proofs
involve verbal reasoning and are consequently hard to follow and check.
Furthermore, the precise mathematical model involved is not always clearly
evident. In particular, the reviewer has felt for some time that geneticists use
the term random breeding to cover a number of different processes that may
lead to different results in finite populations, and the author’s discussion
seems to have this same fault. The author also fails to distinguish between
the actual values of gene frequencies and their expected values, although at
other times he makes effective use of the concept of random fluctuations in
small “isolates.” Furthermore, the modern theory of statistical inference is
virtually ignored. In the section “Isolates and Racial Differences” a type of
discriminant analysis is introduced which is in general inefficient in
comparison with that of Fisher.

The above criticism should not be taken to imply that this is not a valuable 
book. For biologists and men of affairs it discusses clearly the problems of 
heredity in human populations, and shows the importance of further research 
along these lines. The final chapter in particular should serve as a valuable
antidote to the more rabid eugenists and racists. (The author states that the 
first edition was intended in part for the post war edification of the German 
people.) The wealth of tables and graphs interpreting the various formulas 
should be particularly welcome to the less mathematical reader, and the 
conclusions drawn from them are likely to be at least qualitatively correct. 

For the mathematical statistician and the biometrician, this book will be 
primarily valuable as an introduction to a new discipline of great practical 
importance which furnishes some problems of no mean mathematical
difficulty. The reinvestigation and extension of the author’s results, using the
theory of stochastic processes and modern methods of statistical inference, 
provides a wealth of problems at many different levels of difficulty. All 
things considered, it is a book which will well reward reading by those with 
any curiosity toward this subject. 


Principles of Biological Assay. C. W. Emmens (Head of the Department of 
Veterinary Physiology, University of Sydney, Sydney, Australia). Foreword 
by Percival Hartley. London W. C. 2: Chapman & Hall Ltd. (37 Essex Street), 
1948. Pp. xv, 206. 21s.


Review by Lila F. Knudsen
Statistician, Food and Drug Administration
Washington 25, D. C. 

Until just recently, there has been no one comprehensive book on statistics
in biological assay though the need for it has been great. There have
been innumerable articles in scattered journals since most of the
developments in this field have been in the last fifteen years, but nowhere have the
various statistical technics been gathered in one book. Principles of Biological
Assay compares with an excellent recent book entitled Probit Analysis by
Professor D. J. Finney of Oxford, which covers the field of biological assay
with special emphasis on quantal (or percentage) responses and gives a rather
thorough background in design and interpretation of such biological assays.
Dr. Emmens' book attempts to cover a wider approach and includes
quantitative as well as quantal responses on biological assays with one chapter
devoted to the work of Finney and Wood on microbiological assays.

It seems that the title Principles of Biological Assay is misleading and
should have been Statistics in Biological Assay, or maybe Statistical Design
and Analyses of Biological Assay, since the entire book deals with the
statistical design and analysis and not with the pharmacological aspects of assays as
do other books in this field, such as J. H. Burn’s Biological Standardization
or Katherine Coward’s The Biological Standardization of the Vitamins.

Dr. Emmens is primarily a pharmacologist who has found statistics
extremely necessary in arriving at sound conclusions in experimental biology
and the evaluation of strength and comparative effects of drugs. In 1939 he 
wrote a report on biological standards entitled “Variables Affecting the 
Estimation of Androgenic and Oestrogenic Activity” which was issued by 
the Medical Research Council of Great Britain. Dr. Emmens has come a long 
way, statistically speaking, since writing that report and his present book 
covers some of the newer important developments in statistical evaluation
of biological assay such as Irwin’s contribution for taking into consideration 
the error of the slope in calculating the fiducial limits of error, and Finney’s 
probit plane analysis. 

The book touches on the usual introductory statistical concepts; deals 
rather completely with dosage-response lines; describes various experimental 
designs (Latin squares, balanced incomplete blocks, etc.) that have been used
in biological assay; gives formulas for calculating potency, standard errors, 
fiducial limits, combining results of several assays, chi squares, etc. However, 
emphasis seems to be on the manipulation of numbers rather than the
assumptions made and the interpretations to be given the results. Even at that,
the worker in the biological laboratory is given no inkling that in many 
cases it is possible to greatly simplify the formulas for routine calculations. 
The organization of the book seems a bit jumbled to the reviewer. The twenty 
chapters are each very short and each section does not seem to lead logically 
into the next section. The two chapters devoted to design of experiments are 
widely separated. Various tests of significance are scattered through the book. 
An entire chapter is devoted to a rather hazy explanation of polynomial 
coefficients and when factorial coefficients are introduced three chapters 
later, the author does not mention that they include the impressive
polynomial coefficients. Student’s t test occupies one paragraph and the only
formula given is that for comparing a sample mean with the population value 
of the mean. Though the symbol t is used later in comparing two slopes and 
in calculating fiducial limits of the potency, nowhere in the book is an
explanation given of the assumptions underlying it and the way it should be
interpreted.

The section on degrees of freedom seems unnecessarily complicated for 
absorption by a pharmacologist. The author apparently is trying to lead up 
to analysis of variance and factorial coefficients in two pages but simply
confuses the reader. 

Terminology can be very confusing. Dr. Emmens probably didn’t realize
that he gives three widely different concepts to C on pages 84, 86, and 105. 
It would have been possible to choose some other letters or symbols for two 
of the concepts. Another thing that may be confusing is the omission of the 
usual parentheses after the summation sign S in many equations. 

It is rather surprising to see a book of this kind without lists of references 
for additional reading. It must have been an oversight that no mention is 
made of many of the numerous articles and statistical texts that would add 
to the reader’s comprehension of the subject; although full credit is given in 
the text for examples cited and most formulas quoted. The method for 
evaluating biological assays given by E. B. Wilson and Jane Worcester in a 
series of papers published in the Proceedings of the National Academy of 
Sciences in the spring and summer of 1943 is not mentioned nor is Epstein 
and Churchman’s “On the Statistics of Sensitivity Data” in the March 1944
issue of Annals of Mathematical Statistics nor Berkson’s “logits.” 

It is consoling to find the statement on page 188: “The weighted mean log
potency should have a variance $V_{\bar{M}}$ given by

$$V_{\bar{M}} = \frac{1}{Sw}$$

which should not differ significantly from the alternative estimate

$$V_{\bar{M}} = \frac{Sw(M - \bar{M})^2}{Sw(r - 1)} \qquad \text{(d)}$$

where $w = 1/V_M$,” and that “the best estimate of the potency available
remains the weighted mean, which has a variance given by equation (d).” The
implication is that even if the state of statistical quality control is known to
exist, the internal variance of an assay may give too small an error of the
combined potency estimate. The left hand side of equation (d) contains an
obvious typographical error. In the book it is written as $F_M$ instead of
$V_{\bar{M}}$. The denominator of the right hand side of this same equation
would be clearer if written $(r - 1)Sw$.
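
The two variance estimates quoted from page 188 can be compared numerically. The following is a minimal sketch in Python, not taken from the book: it assumes r independent assays, each giving a log potency M with its own internal variance V_M, and it reads the summation sign S as a plain sum over assays. The function name and the figures in the usage note are hypothetical.

```python
def combine_assays(log_potencies, variances):
    """Weighted mean log potency with its internal and external variance.

    Weights are w = 1 / V_M (assumed reading of the quoted passage).
    internal = 1 / Sw                         (first quoted formula)
    external = Sw(M - Mbar)^2 / ((r - 1) Sw)  (equation (d))
    """
    weights = [1.0 / v for v in variances]
    sw = sum(weights)
    mean = sum(w * m for w, m in zip(weights, log_potencies)) / sw
    r = len(log_potencies)
    internal = 1.0 / sw
    external = sum(
        w * (m - mean) ** 2 for w, m in zip(weights, log_potencies)
    ) / ((r - 1) * sw)
    return mean, internal, external
```

With three hypothetical assays, `combine_assays([0.10, 0.14, 0.06], [0.0004, 0.0004, 0.0004])` gives an external variance about four times the internal one, which is the situation the reviewer has in mind: the scatter between assays exceeds what the within-assay errors would predict.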

On the whole, this book will be a welcome addition to the library of a 
pharmacological laboratory, particularly as a source book for formulas, and 
very helpful in promoting the use of statistics in biological assay. 





Traffic Performance at Urban Street Intersections. Bruce D. Greenshields,
Donald Schapiro, and Elroy L. Ericksen, Yale Bureau of Highway Traffic,
Technical Report No. 1. New Haven, Conn.: Bureau of Highway Traffic, Yale 
University, 1947. Pp. xv, 162. Gratis. 

Review by Henry K. Evans 

Highway Specialist, Chamber of Commerce of the United States, 
Washington, D.C. 

This 148-page technical treatise is a highly theoretical discussion of
vehicular traffic movement characteristics (primarily at street
intersections), comparing results of empirical field observations with mathematical
calculations. The field studies employed a specialized photographic
technique; the theoretical calculations revolve about the use of the laws of
probability, especially the Poisson theory.

It is found that the field observations of vehicular spacings and groupings
(in moving traffic) agree very well with mathematical estimates of these
characteristics. Therefore support is gained for the use of the mathematical
approach in estimating traffic behavior, and the authors devote half of the
book primarily to applying the laws of probability to different
hypothetical traffic problems. The proof of the successful application of the
mathematical method of calculating behavior patterns is well substantiated by the
results shown in the book. However, as a useful tool to the practising traffic
engineer or planner, the methods shown have no immediate practical value,
principally because such precise knowledge of these behavior patterns, no
matter how gained, bears negligible relationship to traffic engineering
techniques commonly employed.

For example, various mathematical methods of calculating vehicle seconds
delay at street intersections or drawbridges are presented. The traffic
engineer is not as interested in knowing the precise amount of delay as he is
concerned with knowing whether or not improvements in traffic control
result in substantial reduction in over-all delay, which is easily observed by
before and after field observations and simple arithmetic calculations. (The
field observations would be necessary whether or not the mathematical
refinements were applied.) Not enough is known about economic costs of delay
to warrant any high degree of exactitude in determining vehicle seconds
delay.

The highly theoretical aspect of the book is illustrated by the solution of a 
given problem of determining whether or not the frequency of obscuring 
vision (one driver’s vision of a traffic island obscured by any vehicle within a 
75-foot distance ahead) would be doubled by doubling the traffic volume. 
Use of Poisson’s theory develops the estimate that 16 per cent of the drivers 
would experience obscured vision where traffic volume averaged 300 vehicles 
per hour, and that the figure would be 29 per cent at double the volume.
As a mental calisthenic this method bears some merit but has no practical 
use. 
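
The two percentages quoted are what the zero term of a Poisson distribution would give. The sketch below, in Python, is a reconstruction under stated assumptions rather than the report's own working: it assumes vision is obscured when at least one vehicle lies in the 75-foot zone, that the count of vehicles in the zone is Poisson with mean m, and that m is proportional to traffic volume. The value of m at 300 vehicles per hour is back-solved from the quoted 16 per cent; the function names are illustrative.

```python
import math


def prob_obscured(mean_vehicles_in_zone: float) -> float:
    """P(at least one vehicle in the zone) for a Poisson count: 1 - exp(-m)."""
    return 1.0 - math.exp(-mean_vehicles_in_zone)


def mean_from_prob(p: float) -> float:
    """Invert 1 - exp(-m) = p to recover the Poisson mean m."""
    return -math.log(1.0 - p)


# Mean count in the zone at 300 vehicles per hour, back-solved from 16 per cent.
m_300 = mean_from_prob(0.16)
# Doubling the volume doubles the mean count (assumption stated above).
p_600 = prob_obscured(2.0 * m_300)
print(round(m_300, 3), round(p_600, 3))  # 0.174 0.294
```

Under these assumptions the share rises from 16 to about 29 per cent rather than to 32, because 1 - exp(-2m) equals 1 - (1 - p)^2, so the frequency is not quite doubled.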




Another similar application of the laws of probability is made to a
hypothetical traffic signal timing problem. Here the object is to find the shortest
signal cycle length at an intersection which will pass the entering traffic 
volumes with failure to clear out the waiting vehicles only 5 per cent of the 
time. The method is useful primarily and only for the insight it gives into the 
traffic movement capacity of a roadway as limited by signal cycle length. 
Practically there are so many other factors bearing on selection of optimum 
signal cycle length that the theoretical method has little practical use. 
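
A calculation of this kind might be sketched as follows in Python. Nothing here is taken from the report: the sketch assumes that arrivals in one cycle are Poisson, that a fixed number of vehicles can clear during the green interval, and that any queue left over from an earlier cycle is ignored. The green share, lost time, and discharge headway are hypothetical parameters chosen only for illustration.

```python
import math


def poisson_tail(mean: float, k: int) -> float:
    """P(X > k) for X ~ Poisson(mean), summed term by term."""
    term = math.exp(-mean)
    cdf = term
    for i in range(1, k + 1):
        term *= mean / i
        cdf += term
    return max(0.0, 1.0 - cdf)


def shortest_cycle(volume_per_hour, green_share=0.5, lost_time=4.0,
                   headway=2.0, max_fail=0.05, max_cycle=180):
    """Smallest cycle (whole seconds) failing to clear at most max_fail of the time.

    Capacity per cycle is the number of headways that fit into the usable
    green time. Returns None if no cycle up to max_cycle is adequate.
    """
    for cycle in range(10, max_cycle + 1):
        capacity = int((green_share * cycle - lost_time) // headway)
        if capacity < 0:
            continue
        arrivals = volume_per_hour * cycle / 3600.0
        if poisson_tail(arrivals, capacity) <= max_fail:
            return cycle
    return None
```

A call such as `shortest_cycle(400)` returns the first cycle length meeting the 5 per cent criterion under these assumed parameters, while a volume beyond what the green share can ever pass returns `None`. As the reviewer notes, a figure obtained this way leaves out the many other factors that bear on the choice of cycle length.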

Application of the Poisson theory to estimate the bunching of vehicles
arriving at a parking lot or garage is brought out. This has a value to
designers in making it possible to estimate the reservoir space required to care
for vehicles arriving faster than they can be parked and is one of the most
valuable applications shown of the mathematical approach to practical
problems.

Whereas the techniques illustrated in this book have little application in 
solution of practical traffic problems, they do constitute valuable tools for 
research into generalized warrants for application of traffic control devices 
such as stop signs and traffic control signals. As yet the traffic engineering 
profession has not reached a verdict on what volumes of traffic or frequency
of accidents justify installation of stop signs at an intersection. The method
shown of computing the percentage of unnecessary stops at a stop sign
suggests that this knowledge might be used to determine that below certain
levels of traffic volumes on two streets the percentage of unnecessary stops 
would be high enough to justify declaring that stop signs were unnecessary 
and a nuisance (provided other factors such as traffic accidents and view 
obstructions did not require such signs). 

Although this book provides no ready answers or useful tools for the traffic 
engineer, it deserves to be acclaimed a milestone in the progress of traffic 
engineering research. It is the most scholarly and scientific investigation into 
the characteristics of traffic flow patterns yet published.

The primary contribution to the traffic engineering profession is the proved
application of Poisson’s probability formulae to analyses of traffic flow.
It is doubtful that such application will prove of everyday use for
individual problems in setting signal cycle lengths, determining need for stop
signs and other traffic facilities, or comparing relative efficiencies of different
control devices, because of the many other non-mathematical factors
affecting traffic flow controls and facilities. It is felt, however, that the
mathematical tools set forth in this report will be of inestimable assistance in
performing basic research for the purpose of establishing general standards and
warrants for different methods of control, sizes of traffic facilities, etc.

This method of traffic engineering mathematical analysis is akin to the 
first mathematical studies of nuclear physics. Whereas it has negligible
immediate practical application, it is the entryway into a better understanding
of the principles involved in traffic movement, which will inevitably result 
in practical application. 




Theory of Probability, Second Edition. Harold Jeffreys (Plumian Professor of 
Astronomy, University of Cambridge, Cambridge, England). London E.C.4:
Oxford University Press (Amen House, Warwick Square), 1948. Pp. viii, 411. 
30s. (New York 11: Oxford University Press [114 Fifth Ave.]. $9.00.) 

Review by Herbert Robbins
Associate Professor of Mathematical Statistics
Institute of Statistics, University of North Carolina
Chapel Hill, N. C.

The second edition of this work on statistical inference differs from the
first mainly in the addition of a theory of invariance aimed at establishing 
the logical consistency of the author’s rules for setting up prior probabilities. 
The whole theory, based on Bayes’s principle of inverse probability, lies 
entirely outside the modern development of statistical inference in the work 
of Neyman, E. S. Pearson, and Wald. Its relation to Fisher’s work is more 
obscure; according to Jeffreys his results are in general agreement with 
Fisher’s, although the underlying reasoning is not always the same. 

Perhaps the most striking feature of Jeffreys’ work is that his methods for
testing hypotheses or estimating parameters take no account of the cost to
the experimenter of accepting a false hypothesis or using an erroneous
estimate. For example, a null hypothesis is to be rejected when the ratio of its
posterior probability to that of the composite alternative is too small, the
prior probabilities being determined by a set of conventions into which the
cost of making errors does not enter. This procedure, which is based on an
axiomatic concept of probability that has nothing to do with relative
frequency, is not supported by any operational justification. Within Jeffreys'
theory one cannot ask, let alone answer, whether the procedures are better
or worse as operating rules than others which might be put forward as
alternatives. This was indeed the situation in statistical inference thirty odd years
ago.
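
To make the kind of test described here concrete, the following Python sketch computes the ratio of posterior probabilities for a simple null against a composite alternative. It is an illustration under assumed conventions, not a worked example from the book: the data are binomial, the null fixes the chance at 1/2, the alternative spreads a uniform prior over (0, 1), and the prior odds are even unless stated. The function name is hypothetical.

```python
from math import comb


def posterior_odds_null(successes: int, trials: int,
                        prior_odds: float = 1.0) -> float:
    """Posterior odds of the null p = 1/2 against a uniform-prior alternative."""
    # Probability of the observed count under the null p = 1/2.
    like_null = comb(trials, successes) * 0.5 ** trials
    # Marginal probability under the alternative: the integral of
    # C(n, k) p^k (1 - p)^(n - k) over (0, 1) equals 1 / (n + 1).
    like_alt = 1.0 / (trials + 1)
    return prior_odds * like_null / like_alt
```

For instance, `posterior_odds_null(5, 10)` is about 2.7 in favor of the null, while `posterior_odds_null(10, 10)` is about 0.011 against it. No cost of a wrong decision appears anywhere in the calculation, which is the feature the reviewer singles out.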

Whatever one’s views on the foundations of statistical inference, one
cannot fail to profit by the mathematical ingenuity, keen physical intuition, and
common sense which Jeffreys brings to bear on a wide variety of practical
problems. Statistical theory will continue in large measure to find the
inspiration and motivation for its real advances in the practice of the best
statisticians. The appearance of the second edition of a book as provocative and
rich in content as Jeffreys’ Theory of Probability is a welcome event.





Rank Correlation. Maurice G. Kendall (Joint Assistant General Manager and
Statistician, Chamber of Shipping of the United Kingdom, London E.C.3,
England). London W.C.2: Charles Griffin & Co. Ltd. (42 Drury Lane), 1948.
Pp. vii, 160. 18s.

Review by E. J. G. Pitman 
Professor of Mathematics, University of Tasmania 
Hobart, Tasmania

In the words of the preface: “Until a few years ago rank correlation was a
rather neglected branch of the theory of statistical relationship. In the
practical field it was generally regarded, except perhaps by psychologists, as
a makeshift for the correlation of measurable variables; and in the
theoretical field it seemed to present no interesting or important problems. That
situation has changed. Practical applications of ranking methods are not
only being extended in psychology and education but are being made in
other subjects such as industrial experimentation and economics. The
theoretical properties of order-statistics have received much attention and are
throwing important light on some difficult questions of statistical inference.”

As the title indicates, this book deals only with the application of ranking 
methods to testing independence and to estimating degree of dependence 
of chance variables. The basic problem is, given n pairs of values of the
chance variables X and Y, to decide whether X and Y are independent or
not. By replacing the observed X values by their ranks, and the Y values by 
their ranks, we are able to test the hypothesis of the independence of X and 
Y without making any assumptions about the forms of the distributions. 
Moreover, we can do this when the values of one or both variables cannot be 
properly measured but can only be arranged in order of magnitude. 

The book gives a clear and comprehensive account of the properties of
Spearman’s coefficient of rank correlation ρ, and of the coefficient τ, which
depends essentially on the number of inversions in the ranks of one variable
when the ranks of the other variable are arranged in natural order. Neither
here nor elsewhere has the latter coefficient been given a name in English.
The complications arising from ties are thoroughly treated.
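
The two coefficients can be illustrated with a short Python sketch. It is not drawn from the book and deliberately assumes there are no ties, so the corrections for ties that the book treats are left out. Spearman's coefficient is computed from squared rank differences, and the inversion-based coefficient from a count of concordant pairs less discordant pairs; the function names are illustrative.

```python
def ranks(values):
    """Ranks 1..n of the values, assuming no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def spearman_rho(x, y):
    """1 - 6 * sum(d^2) / (n (n^2 - 1)), where d is the rank difference."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def inversion_tau(x, y):
    """(concordant pairs - discordant pairs) / (n (n - 1) / 2), no ties."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            # A pair is concordant when both variables order it the same way.
            product = (x[i] - x[j]) * (y[i] - y[j])
            score += 1 if product > 0 else -1
    return score / (n * (n - 1) / 2)
```

For the rankings `[1, 2, 3, 4, 5]` and `[2, 1, 4, 3, 5]` the first coefficient is 0.8 and the second 0.6: two of the ten pairs are inverted, and each inversion lowers the score by two.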

The problem of m rankings and partial rank correlation are dealt with,
and also the relation of ρ and τ to the population correlation coefficient in
samples from a normal bivariate population. The last two chapters discuss
paired comparisons, as when an observer compares n objects two at a
time without necessarily in the end arranging the n objects in a linear order.
The tables necessary for making all the tests described in the book are
provided.

Lack of precision in statement and in notation, and insufficient explanation 
sometimes make the author’s argument difficult to follow. For example, on 
page 58 we have

[displayed equation not legible in this copy]

where $\Sigma'$ denotes summation over values for which $j$ [condition illegible]
and $\Sigma''$ over values for which $i$ [condition illegible] and $j \neq l$.

The summation in the first line is intended to include only terms in which
the two first suffixes are the same or tied, and the two second suffixes are tied,
i.e., terms like $a_{ij}b_{ji}$ are not to be included. It is natural to assume that the
same convention applies to the first summation in the second line, and this
is in accordance with the statement about $\Sigma'$. The author then goes on to
show that the mean value of $\Sigma''$ is 0; but this requires $i \neq k$, $i \neq l$,
and $j \neq k$, $j \neq l$. If that is so, the first summation must include terms in
which the $b$ suffixes are the same as the $a$ suffixes, but in reverse order, so
that terms like $a_{ij}b_{ji}$ are to be included. (Note: $a_{ij} = -a_{ji}$,
$b_{ij} = -b_{ji}$.)

Again, on page 61 it is stated that when summations extend over values of
suffixes which exclude certain values (e.g., $i \neq k$) we can replace them by
complete summations. This will puzzle the conscientious reader, as no reason
is given. What is meant is that we can do this without affecting the dominant
terms of their mean values. This is explained later, page 66, in the discussion
of a more general case.

It is an excellent idea to publish a monograph on a selected portion of 
statistical theory, and it is to be hoped that other workers in statistics will
follow this author’s example. 


Report on the Scheme for the Improvement of Agricultural Statistics. V. G.
Panse. Imperial Council of Agricultural Research. Delhi, India: Manager of
Publications, 1946. Paper, 3s. 9d.

Review by S. Lee Crump 
Iowa State College, Ames, Iowa

The brochure under review is arranged in eight chapters and four
appendices. The main chapters discuss the following topics: statistics of area
under agricultural crops, crop cutting experiments, forecasts and estimates 
of yield, sampling surveys for estimating yields. The remaining chapters are 
of a more general or incidental nature. 

The total production of a crop in India is currently estimated from the 
following formula: Area × Normal Yield × Seasonal Factor. Dr. Panse has
approached the problem of improving agricultural statistics with this formula
as a take-off point. Each of the first two factors is discussed in considerable 
detail while the third is treated briefly. 

The area under agricultural crops is assumed to be known with a high 
accuracy since an annual census is taken. The chief problems in this
connection are those arising from mixed crops and from divided fields. Suggestions
are made for reducing the error from these two sources. It seems to this
reviewer that Dr. Panse may underestimate the importance of other errors
which seem always to introduce incompleteness in censuses. 

By far the greater part of the discussion is devoted to the problem of
estimating yield. Dr. Panse has undertaken a thorough study of the methods of
estimating yield in current use. They all suffer from subjectiveness at almost
every stage. The crop cutting experiments which should provide the best
estimates of yield are not properly conducted, and are frequently ignored.
Throughout the discussion the point is made that “normal yield” is an
ill-defined concept and should be abandoned in favor of estimating the yield
directly each year.

Finally, the results of a random sampling survey for estimating the yield 
of wheat are presented and discussed. As has always been the case, the
random sampling survey method seems to offer the only economical scheme for
obtaining reliable estimates. 

The first three appendices give the details of the operation of the sample 
survey. In Appendix 4 two papers by V. G. Panse and R. J. Kalamkar on
sample surveys are reprinted. 

Dr. Panse should be commended for his approach to the problem. At each
stage of the investigation he has attempted to suggest improvements in the
estimates within the existing framework. That the final results indicate the
overwhelming superiority of random sample surveys is not because Dr.
Panse has arbitrarily rejected the existing scheme for making estimates.


Methods of Estimating Vital Statistics of Fish Populations. William E. Ricker
(Professor of Zoology, Indiana University, Bloomington, Ind.). Indiana
University Publications, Science Series No. 15. Bloomington, Indiana: Indiana
University Bookstore, 1948. Pp. v, 101. Paper. $2.00.

Review by Charles M. Mottley
Operations Analyst, Headquarters United States Air Force
Washington 26, D. C. 

The vital statistics of fish populations have undoubtedly been intriguing
from the earliest times. The first fishermen must have wondered why the
fish were abundant at one time and not the next. The surface of the water
presents a barrier which man can seldom penetrate. The human census taker
can draw representative samples, interview his subjects directly and gather,
with relative ease, the information that he needs regarding the state of the
population. The fishery biologist must resort to indirect census-taking
methods if he wishes to determine the stock on hand or to measure fluctuations in
abundance in relation to different environmental conditions or changes in
fishing regulations.

The abundance of species that are exploited commercially is a matter of 
considerable economic importance. The accuracy of forecasts of relative 
abundance is of great concern to fishing interests. The California sardine 
fishing industry has just experienced an unpredicted and little understood 
period of scarcity which has caused considerable economic loss. 

There is a growing list of publications on the subject of the vital statistics
of fish populations and a considerable proportion of the funds for fishery
research is spent in this field. Dr. Ricker’s work is one of the first attempts to
bring together the body of modern knowledge on the subject. He has worked
actively in this field himself. His own investigations have included original
work on many species of freshwater and anadromous fishes. This first-hand
experience has been supplemented by a close study of the literature on many
marine species, such as the Pacific halibut, the California sardine and many
others which are referred to in his publication.

The subject of vital statistics is usually approached by fishery biologists
through the data on the catch. After the introductory chapter, which defines
the problems and presents certain important numerical relationships and
terminology, Dr. Ricker devotes two chapters to the interpretation of catch
curves. These curves show the relationship between age and frequency, and
provide a means of gaining knowledge of recruitment and mortality. The
age of fish can be determined by interpreting the growth markings on the
scales or bony structures, or by noting the length frequency groups in the
population.

A more direct method of deriving vital statistics is to mark some of the fish 
that are caught and return them to the water. The marked individuals can 
then be identified when they are recaptured. Tagging is frequently used, so 
that individuals can be recognized. The number of marked or tagged fish in 
the catch is used to calculate the rate of exploitation of the population and to 
determine the abundance. Dr. Ricker presents the different methods for 
computing these values and points out the limitations to marking experi¬ 
ments. He also lists the assumptions on which the validity of such population 
estimates is based. Dr. Ricker devotes six of his nine chapters to the fasci¬ 
nating problems of deriving vital statistics from the results of marking experi¬ 
ments. 
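
The arithmetic behind such marking estimates can be illustrated with a
minimal Python sketch of the simple proportional (Petersen-type) calculation.
The review does not reproduce Dr. Ricker's formulas, so the function names,
the example figures, and the choice of this particular estimator are
assumptions made for illustration only. The sketch presumes that marked and
unmarked fish are equally likely to be caught and that no marks are lost.

```python
def petersen_estimate(marked_released: int, catch: int, recaptured: int) -> float:
    """Proportional estimate of population size, N = M * C / R.

    marked_released (M): fish marked and returned to the water
    catch (C): fish examined in the later catch
    recaptured (R): marked fish found in that catch
    """
    if recaptured <= 0:
        raise ValueError("at least one marked fish must be recaptured")
    return marked_released * catch / recaptured


def exploitation_rate(marked_released: int, recaptured: int) -> float:
    """Fraction of the marked fish retaken by the fishery, u = R / M."""
    if marked_released <= 0:
        raise ValueError("some fish must have been marked")
    return recaptured / marked_released


if __name__ == "__main__":
    # Hypothetical figures: 500 fish marked, 1200 later examined, 60 bear marks.
    print(petersen_estimate(500, 1200, 60))
    print(exploitation_rate(500, 60))
```

With these hypothetical figures the marked fish make up one twentieth of the
catch, so the 500 marked fish are taken to be one twentieth of the stock.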

He has included a useful "actuarial” appendix. It gives numerical values 
of the functions needed in this type of work, including the instantaneous 
mortality rate, the annual mortality rate and the annual survival rate. 
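
The three rates named above are linked by a standard exponential relation,
which a short Python sketch can make concrete. The review does not print the
appendix tables or their notation, so the function names and the assumption
of a constant instantaneous rate acting over one year are illustrative only.

```python
import math


def annual_survival_rate(instantaneous_rate: float) -> float:
    """Fraction surviving one year under a constant instantaneous rate i: s = exp(-i)."""
    return math.exp(-instantaneous_rate)


def annual_mortality_rate(instantaneous_rate: float) -> float:
    """Fraction dying within one year: a = 1 - exp(-i)."""
    return 1.0 - math.exp(-instantaneous_rate)


def instantaneous_rate_from_survival(survival_rate: float) -> float:
    """Invert the relation: i = -ln(s), defined for 0 < s <= 1."""
    if not 0.0 < survival_rate <= 1.0:
        raise ValueError("survival rate must lie in (0, 1]")
    return -math.log(survival_rate)


if __name__ == "__main__":
    i = 0.5  # hypothetical instantaneous mortality rate
    print(annual_survival_rate(i), annual_mortality_rate(i))
```

A table of such values saves the worker from repeated exponential and
logarithmic computation, which is presumably the purpose of the appendix.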

Although Dr. Ricker’s work will be of most value to fishery "actuaries,” 
others will find it extremely useful as a source of ideas and methods. Military 
scientists have made use of Volterra’s treatment of the problems of compe¬ 
tition between species to develop combat models. Dr. Ricker’s work may 
also prove to be useful in the apparently unrelated field of combat attrition. 

The work is undoubtedly a first edition. It is hoped that it will be enlarged. 
In future editions some mention should be made of the approach suggested 
by Dr. D. B. DeLury and this reviewer, which utilizes the data on the catch 
per unit of effort to derive direct estimates of the total population under 
certain conditions. 
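
One way such a catch-per-unit-effort estimate can be formed is sketched below
in Python. The review gives no formulas, so this is not offered as the DeLury
and Mottley procedure itself. It is a simple depletion regression, assumed
here for illustration, in which catch per unit of effort is taken to fall
linearly as the cumulative catch removed from a closed population rises. The
function name and the data are hypothetical.

```python
def depletion_estimate(catches, efforts):
    """Estimate initial population N0 and catchability q from a depletion series.

    Assumed model: CPUE_t = q * (N0 - K_t), where K_t is the cumulative
    catch taken before period t. An ordinary least-squares line of CPUE on
    K_t then has slope -q and intercept q * N0.
    """
    if len(catches) != len(efforts) or len(catches) < 2:
        raise ValueError("need at least two periods of matching catch and effort")

    cpue = [c / f for c, f in zip(catches, efforts)]

    # Cumulative catch removed before each period.
    cumulative = []
    total = 0.0
    for c in catches:
        cumulative.append(total)
        total += c

    n = len(cpue)
    mean_x = sum(cumulative) / n
    mean_y = sum(cpue) / n
    sxx = sum((x - mean_x) ** 2 for x in cumulative)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cumulative, cpue))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    q = -slope
    if q <= 0:
        raise ValueError("CPUE does not decline; the model does not apply")
    return intercept / q, q


if __name__ == "__main__":
    # Hypothetical series: three periods of 10 units of effort each.
    n0, q = depletion_estimate([100, 90, 81], [10, 10, 10])
    print(n0, q)
```

The estimate is meaningful only under the conditions the reviewer alludes to,
in particular a closed population and a fishery intense enough to reduce the
catch per unit of effort noticeably.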





AMERICAN STATISTICAL ASSOCIATION JOURNAL, SEPTEMBER 1949 


Elementary Statistical Analysis. S. S. Wilks (Professor of Mathematical
Statistics, Princeton University, Princeton, N. J.). Princeton, N. J.: Princeton
University Press, 1949. Pp. xi, 284. Paper. $2.50. (London E.C.4: Oxford University
Press [Amen House, Warwick Square]. 14s.)

Reviewed by T. A. Bancroft
Associate Professor, Statistical Laboratory, Iowa State College
Ames, Iowa 

Criteria must be established for a critical evaluation of the plethora of
new textbooks and new editions of old texts on elementary statistics
appearing in the postwar period. The usual questions asked in textbook
appraisals in general are: Is new material included, i.e., has the author taken
cognizance of recent methodological advances and research applications?
Again, has the material been arranged in a fresh and stimulating fashion
without sacrificing soundness? Final judgment, of course, might ultimately
rest on particular questions regarding clarity, simplicity, problems included,
references listed, errors, type, general appearance, etc.

It is the reviewer's opinion, however, that, for texts on elementary
statistics, a further question is in order, i.e.: What is the particular teaching
program in statistics for which the text is designed? That is, was the text
designed to be used as a beginning course in statistics in: (a) a single field of
application such as economics, biology, agriculture, education, etc., or (b)
in a mathematics department as another course in mathematics or a general
service course for applied fields, or (c) in a department of statistics as a basic
prerequisite course to advanced statistical courses or a general service course
for applied fields? Presumably a text designed for a general service course for
applied fields might be used either as a mathematics’ or statistics’ department
offering provided mathematical prerequisites were the same. Also, a
basic prerequisite course to advanced statistical courses might also use
material adequate for a general service course. A companion question is: At
what college-year level will the course be offered?

Professor Wilks has attempted with the material and arrangement of this
new text to follow the recommendations of three distinguished committees
on the teaching of statistics regarding the introduction of a basic elementary
course available centrally to all students needing an understanding of
statistical concepts and techniques common to all fields of application. Quoting
from the preface, Professor Wilks says, “This book has been prepared for a
one-semester basic course in elementary statistical analysis which, at
Princeton, is the introductory course for all fields of statistical application, and is
usually taken in the freshman year. It is especially designed for those who
intend to go into the biological and social sciences. It presupposes one
semester of elementary mathematical analysis covering topics such as those
included in the first half of F. L. Griffin’s Introduction to Mathematical
Analysis.”

The text, then, is designed as a basic general service course, presumably to
be offered in a mathematics or statistics department at the latter part of the
freshman year. It should be noted that the mathematical prerequisite implies 
an acquaintance with the elements of calculus which would apparently limit 
the availability of the text for freshman use to those colleges in which the 
elements of calculus are taught in the first semester or first two quarters of 
the freshman year. Within such a college the availability of the text would 
further be limited for the most part to freshmen whose curricula require the 
prerequisite mathematical analysis course. In some colleges this would 
eliminate altogether the students for which the text was primarily prepared, 
i.e., those in the biological and social sciences. In such cases a possible solu¬ 
tion might be to follow still further the recommendations of the three com¬ 
mittees mentioned earlier in setting up two basic elementary courses in sta¬ 
tistics with different mathematical prerequisites. 

A very good indication of the contents of the text may be obtained from 
the chapter headings: Introduction, Frequency Distribution, Sample Mean 
and Standard Deviation, Elementary Probability, Probability Distributions, 
The Binomial Distribution, The Poisson Distribution, The Normal Distribu¬ 
tion, Elements of Sampling, Confidence Limits of Population Parameters, 
Statistical Significance Tests, Testing Randomness in Samples, and Analysis 
of Pairs of Measurements. No attempt has been made to include a discussion 
of the analysis of variance or more sophisticated problems in statistical in¬ 
ference since the material is designed for a one semester course only. 

In allotting ten chapters to the elements of sampling statistics and only
three to descriptive statistics, the reviewer believes that Professor Wilks has
made a valiant attempt to provide pertinent material for a modern basic
course to be centrally taught in elementary statistics. The text is remarkably
free of typographical errors. No list of references is included. Although it is
stated that the material is primarily for students in biology and the social
sciences, many problems and examples are taken from engineering, industry,
and the physical sciences. The reviewer believes that more attention should
be given to experimental sampling and that a basic course in statistics should
be conducted with benefit of a computing laboratory.

Professor Wilks’ experiment in teaching such materials in a central
freshman-level course should be observed closely. It represents an approach
by a mathematical statistician to provide teaching material for this type
course. Attempts to satisfy this need from another approach are being made
at other institutions by applied or experimental statisticians. In the
reviewer's opinion, the two approaches should supplement rather than compete
with one another. Possibly ideal teaching materials for such a course will be
the outcome of cooperative efforts of several statisticians with as many
viewpoints.



LETTERS ABOUT BOOKS 


Readers are invited to submit letters about statistical methodology books for
publication in this forum. Concise, informative letters which supplement previously
published reviews by pointing out specific strengths, weaknesses, errors, and errata in
currently used books are wanted. Criticisms based on actual use of a book as a text
are especially desired from statistics instructors. Other letters may consist of
suggestions for the writing of books and reviews. Letters which contain adverse criticisms
of Journal reviews will be submitted to the author of the review for any reply he
may care to make. Contributors are requested to avoid personalities. The right to
decide whether a letter merits publication is reserved. Letters should be sent to the
review editor, Oscar K. Buros, Rutgers University, New Brunswick, N. J.


EXPERIMENTAL DESIGNS IN 
SOCIOLOGICAL RESEARCH 

In the literature of social science, two uses of the term “experiment”
appear: first, to stand for trial and error attempts at the resolution of
some problem of human relations (this is the popular usage); and second, to
describe the method of precisely controlled observation used in physical
science. It is to be expected that my use of the term, “experimental designs
in sociological research,” to mean observations of social relations under
conditions of control introduced by matching, in my recent book, Experimental
Designs in Sociological Research (New York: Harper & Bros., 1947),
should be misunderstood and hence challenged in reviews by mathematical
statisticians Kempthorne,¹ Ackoff,² Keyfitz,³ and Hagood,⁴ to mention a
few. In fact, at least one sociologist, Hornell Hart,⁵ has made substantially
the same comment and suggested that the preferred term might be,
“statistical comparisons with matched control groups.”

The first use of the term “experiment,” as a trial and error attempt to
influence social relations by social action, I specifically exclude as the
meaning of the nine studies summarized in my book (pp. 22-28, 29-33). My
interest was to illustrate the crude beginnings of efforts to observe, under
conditions of control by matching, what really happened to people when
such trial and error experiments, taking the form of programs of social
treatment or social reform, were used to influence them. It was my purpose
to show that the systematic study of social action (trial and error
“experiments”), is necessary if we are to appraise objectively the results often
claimed for such programs.

The second meaning, I have also disclaimed (pp. 4-6, 26, 29, 32-33),
although the term “experiment” was sometimes used as an abbreviation of
the more cumbersome term, “experimental designs in sociological
research,” but when so used its reference was made clear in the context in which
it occurred, even if the original author, whose work was summarized, used the
term carelessly (Chap. 4).

These explanations and qualifications still leave us with the need of a
term which may be used to describe studies of problems of social relations
in which: (1) a group of persons who receive a program of treatment is
compared with (2) a group excluded from this treatment; the situation is, (3)
such comparisons take place in the natural community (not in the artificial
situation of the laboratory or in the class room situation); and finally,
(4) embrace an attempt to control by matching some of the factors in the
situation other than, (a) the pattern of treatment factors, and (b) the
pattern of response factors. Studies of this sort I have called experimental
designs in sociological research. They are not, of course, “experiments” in
Fisher’s meaning of the term, and I have made no such claim.

¹ Kempthorne, O. J. Am. Stat. Assn., 43: 489-492, September 1948.
² Ackoff, Russell L. Sci., 107: 509-510, May 14, 1948.
³ Keyfitz, Nathan. Am. J. Sociol., 54: 259-260, November 1948.
⁴ Hagood, Margaret Jarman. J. Am. Stat. Assn., 44: 312-313, June 1949.
⁵ Hart, Hornell. Social Forces, 27: 96-98, October 1948.


The studies which I summarized were conducted in the complex community
situation, hence the qualifying terms, “in sociological research.”
I suspect that the full import of these latter words can hardly be meaningful
to any one who has not done empirical research in this field of complex social
forces, although several reviewers⁶ seem quite aware of their significance.
I am still of the opinion that the designation, “experimental designs in
sociological research” is a more meaningful descriptive title for these studies
than Greenwood’s⁷ single term “experiment” (that promises too much)
which he applied to several of these studies, or Hart’s⁸ “controlled
comparisons” (which promises too little). No doubt usage will determine the
acceptance and survival of these terms, as it does all other terms.

Inadequate definitions of the universes sampled is another point of
criticism. As to this point, several of the investigators whose work I
described did not, unfortunately, offer any such definitions. The chief
explanation appears to be that the purpose of each study was to observe what
happened to specific groups of subjects when treatment was or was not
applied. The studies were exploratory and none was designed to provide a
basis for generalization to a universe. Since, however, the criticism is well
taken it may be worthwhile to note in passing that there are at least two
types of universe concerned where social attitudes are measured. First,
there is the universe of respondents in a defined area at a given time. Area
sampling technique is the preferred procedure to be followed in all such
cases. But two hypothetical universes are also involved and as yet little is
known about the sampling techniques to be employed in such cases. One of
these universes is that of all possible variations in the attitude of each
person on a given issue over a period of time. This universe may be further
subdivided into public attitudes and private attitudes, a problem of
considerable practical as well as theoretical importance. Then there is the universe
of all possible questions that may be formulated to elicit the true attitude
of a person on a given issue. Stouffer⁹ has recognized this problem by his
statement, “Compared with the sampling of respondents, the developments
in the sampling of items are still in a relatively primitive state.” These
points illustrate the great complexity of the concept of universe in
sociological research.

The impossibility of randomization of the treated group of persons and the
untreated group, presents the most serious problem, and is a well-deserved
criticism if randomization has been neglected through ignorance, but
randomization was not possible as fully explained in considerable detail (pp.
168-169). To state the problem briefly, no administrator of a private
treatment program or of a legalized public reform effort would permit the
recipients of treatment to be chosen at random. The local mores of the
community or the public law that governs the program determines eligibility for
treatment, whether it be by social case work, relief or public housing, by need
for such treatment. In all such studies of programs we deal with a de facto
situation which the observer cannot control. It would be an important
contribution to research if some way could be found to get around this
serious limitation. Furthermore, the consequences of such a limitation both
as it precludes generalization from samples to a universe, and as it precludes
control of unknown factors, is also discussed in detail (pp. 60, 78, 83,
89-90, 140-141, 166-169, 179-186) so that this point was by no means neglected.
It may be also worth noting that because randomization was not possible
no effort was made to use the powerful tool of analysis of variance. This
restriction seemed to the author such an obvious logical consequence that no
explanation of detailed reasons seemed necessary, although by implication at
least, the reviews by mathematical statisticians suggest use of analysis of
variance. Perhaps the best statement on this point is still that made by
Margaret Hagood¹⁰ in 1941, “Are the methods of analysis of variance (and
covariance) applicable for observational situations where data are
observed in the cross classifications in which they are found rather than in
cross classifications in which they have been randomly placed in an
experiment?”

⁶ See reviews by Hagood (op. cit.), 11. F, Slotto (Int. J. Opinion 2: 412-413, Fall 1948),
Donald E. Super (J. Appl. Psychol., 33: 93-95, February 1949), T. G. Andrews (Psychometrika, 13:
281-283, December 1948), and Otis D. Duncan (Rural Sociol., 13: 100, June 1948).
⁷ Greenwood, Ernest. Experimental Sociology. New York: King’s Crown Press, 1945. Pp. xv, 153.
⁸ Hart, op. cit., p. 98.
⁹ Stouffer, S. A. “Government and the Measurement of Opinion,” p. 436. Sci. Mo., 63: 435-440,
December 1946.

Since the sociologist seems at this stage of development of his field
research to be denied the advantages of analysis of variance, resort is made of
necessity to the convergence of such evidence as he has: (1) the occurrence
of small differences that are in the same direction of logical meaning (pp.
42-46, 49, 103-106); and (2) the persistence of such differences in more and
more homogeneous samples after losses from mobility, death, refusals and
inability to match have taken their toll. Patterns of difference which still
persist after extreme cases have been lost, are regarded as evidence of real
differences, and hence may be taken as evidence of the effects of treatment,
subject always to the qualifications set by absence of control of unknown factors.

Questions naturally arise as to the validity of applying the conventional
tests of statistical significance to differences between the statistics of samples
that are non-random. Here again, I can but refer the reader to a somewhat
detailed discussion of this point (pp. 176-186) wherein the qualifications
and limitations are examined with the conclusion that only empirical results
can be claimed, or as one reviewer puts it my solution is, “a pragmatic rather
than a fully analytic solution.”¹¹

My concept of an ex post facto experimental design will have to rest on
evidence yet to be gathered. Whether it will weather the tests of results in
future research only time can tell.

Differences of opinion on the meaning of the concepts “null hypothesis”
and “causation” appear in the literature of statistics and of logic. As to the
null hypothesis my use follows that of Lindquist,¹² who appears to follow an
earlier edition of Fisher’s Design of Experiments, in which the concept was
used to denote any exact hypothesis that one may be interested in disproving,
and not merely the hypothesis that a certain parameter is zero, which
later has come to be the conventional meaning among some groups of
statisticians. Some other statisticians seem also to use the meaning I attach
to the term.¹³ For the sociologist, whose universes are complex and difficult
to define, parameter values are seldom known; but there are real
advantages in negative statement of relationships with the object of
disproving them, because as I try to explain (pp. 70-73, 83-84, 93, 137-138,
167, 187) such formulations help us get rid of normative considerations.
Criticisms of my use of the term seem to be based upon the contention that it
departs from the current conventional usage in biological experiments and in
biometrics. There are, however, examples of different usages of the same
technical concept or term in other areas of research. For example, the
term “ambivalence” has one meaning in chemistry and a different but
accepted meaning in psychiatry; and the term “rationalization” has one
meaning in systematic economics and a different meaning in abnormal
psychology.

As to my use of the concept “causation,” it will be noted in the first place
that no proof of causation is claimed for any of the studies summarized (pp.
50, 74, passim), that replication of each study is recommended as the
next step toward fuller understanding of the relationships (pp. 31, 57, 90,
120, 139, 170, 177, 185, 188, 189), and that my definition of causation (pp.
52-54) is consistent with the position of logicians as represented in the recent
book of Herbert Feigl and Wilfrid Sellars, Readings in Philosophical
Analysis (New York: Appleton-Century-Crofts, Inc., 1949).

F. Stuart Chapin, Professor of 
Sociology, Chairman of the De¬ 
partment, and Director of the 
School of Social Work, Univer¬ 
sity of Minnesota, Minneapolis, 
Minn. 


¹⁰ Statistics for Sociologists, p. 686. New York: Reynal and Hitchcock, Inc., 1941. Pp. viii, 934.
¹¹ Duncan, op. cit.
¹² Lindquist, E. F. Statistical Analysis in Educational Research, p. 15. Boston, Mass.: Houghton
Mifflin Co., 1940. Pp. xii, 266.
¹³ Hagood, op. cit., p. 869; and Snedecor, George W. Statistical Methods, p. 64. Ames, Iowa:
Collegiate Press, Inc., 1937. Pp. xiii, 341.




JOURNAL OF THE AMERICAN 
STATISTICAL ASSOCIATION 


Volume 44 December 1949 Number 248 


ARTICLES

Statistics of the Kinsey Report. W. Allen Wallis 463

The City Block as a Unit for Recording and Analyzing Urban Data.
Edward B. Olds 485

The Relation of the Net Reproduction Rate to Other Fertility Measures.
T. J. Woofter 501

On Estimating the Mean and Standard Deviation of Truncated Normal
Distributions. A. C. Cohen, Jr. 518

On Some Mathematical Problems Arising in the Development of Mendelian
Genetics. Hilda Geiringer 526

The Fitting of Logistic Curves by Means of a Nomograph.
Eugene A. Rasor 548

On the Best Choice of Sample Sizes for a t-Test when the Ratio of
Variances is Known. John E. Walsh 554

Note on Some Errors in “The Evidence of Periodicity in Short Time
Series”. Armen A. Alchian 559

William Lane Austin (1871-1949)

James Clyde Capt (1888-1949). Stuart A. Rice 565

Index of Journal, Volume 44, 1949 (Nos. 245, 246, 247, 248) 583

Articles, by author 587

Book Reviews, by author 588

BOOK REVIEWS

David, F. N., Probability Theory for Statistical Methods.
John W. Tukey 567

Herdan, G., Quality Control by Statistical Methods. Paul Peach 569

Johnson, Palmer O., Statistical Methods in Research.
Frederick Mosteller 570

Psychological Statistics. Edmund Churchill 572

Rissik, H., Quality Control in Production: A Machine-Shop Manual on the
Statistical Method of Controlling Product Quality During Manufacture.
H. A. Freeman 574

Schumacher, F. X., and Chapman, R. A., Sampling Methods in Forestry
and Range Management, Second Edition. Walter H. Meyer 575

Wiener, Norbert, Cybernetics. Sebastian B. Littauer 577

Yates, Frank, Sampling Methods for Censuses and Surveys.
W. Edwards Deming 580


Index to Volumes 1-34,1888-1939, may be obtained from the ASA. The Journal 
is also indexed in the Industrial Arts Index and the Public Affairs Information 
Service Bulletin. 




JOURNAL OF THE AMERICAN
STATISTICAL ASSOCIATION


Number 248 DECEMBER 1949 Volume 44


STATISTICS OF THE KINSEY REPORT*

W. Allen Wallis

University of Chicago

In a preface that describes my own preconceptions and attitudes in
approaching the Kinsey report, Dr. Alan Gregg of the Rockefeller
Foundation, which “contributed a major portion of the cost of the
program” (p. vii), observes that “the history of science is part of the history
of the freedom to observe, to reflect, to experiment, to record, and to
bear witness. It has been a perilous and a passionate history indeed,
and not yet ended. . . . The findings of Dr. Alfred C. Kinsey and his
associates at Indiana University deserve attention for their extent,
their thoroughness, and their dispassionate objectivity. . . . Certainly
no aspect of human biology in our current civilization stands in more
need of scientific knowledge and courageous humility than that of
sex. . . . These studies are sincere, objective, and determined
explorations of a field manifestly important to education, medicine,
government, and the integrity of human conduct generally. They have
demanded from Dr. Kinsey and his colleagues very unusual tenacity of
purpose, tolerance, analytical competence, social skills, and real
courage. I hope that the reader will match the authors with an equal and
appropriate measure of cool attention, courageous judgment, and
scientific equanimity” (pp. v-vi).

* Alfred C. Kinsey, Wardell B. Pomeroy, and Clyde E. Martin, Sexual Behavior in the Human
Male. W. B. Saunders Company: Philadelphia and London, 1948. Pp. xv. $6.50. The book is
commonly referred to simply as “the Kinsey report,” since Kinsey initiated the project in 1938 (p. 10),
and did about four sevenths of all the interviewing that had been done up to the date of publication
(p. 11). Similarly, it is common to write “Kinsey” where “Kinsey, Pomeroy, and Martin” or “the
authors” would be correct. These abbreviated forms of reference are used throughout this paper.

This paper is a revision of one prepared by invitation of the 1948 Program Committee of the
American Statistical Association and presented at Cleveland on 29 December 1948. Some of the
material was presented to the Society for Social Research at the University of Chicago on fi April 1948.

The notes have been collected at the end of the paper, since they should be read after reading the
whole of the text. They are reasonably self-contained and can be read without reference to the text,
but their numbers have been included in brackets in the text to indicate points that are expanded in
the notes.



464 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1949 

Sexual behavior is an important and interesting subject that has 
been considered intensively and extensively from nearly every view¬ 
point except the simple, factual one of what people actually do. Now, 
however, we have in the Kinsey report an intensive and extensive col¬ 
lection of facts about overt sexual behavior objectively defined. The 
report is, in short, a statistical study of sexual behavior. It presents 
figures on the varieties of sexual activity and the frequency with which 
each occurs. It reveals the variation of sexual behavior for given indi¬ 
viduals and among different individuals, and it relates the variation to 
age, education, occupation, marital status, religion, and urbanization. 

A statistical study cannot, of course, encompass the whole of what 
we think of as sexual behavior. A definition in terms of overt acts— 
Kinsey’s definition is in terms of activities which ordinarily culminate 
in orgasm [1]—necessarily excludes aspects that some will describe as
“the essential nature of sexual activity,” the aspect chosen as essential
depending on whether the chooser is an artist, biologist, criminologist,
dramatist, psychologist, sociologist, theologist, poet, or philosopher,
and perhaps depending also on his own sexual background. As Gregg
says, “a great mountain may present aspects that are . . . so different 
that bitter disagreements can arise between those who have watched 
the mountain, truly and well, through all the seasons, but each from a 
different quarter” (p. v). Surely one of the significant aspects of human 
sexual behavior is the statistical one: how many do what, and how 
frequently? This is what Kinsey has undertaken to find out. That it is 
not the only thing worth knowing about sexual behavior is no ground 
for criticism; that it is an important thing to know about sexual be¬ 
havior can hardly be denied. 

There have been, to be sure, other statistical studies of sexual
behavior. Kinsey describes nineteen (pp. 23-31), some of them
excellent [2]. But no other study has been comparable with Kinsey’s in the
number of individuals included or in the amount of data about each
individual. He has collected data from over twelve thousand persons [3], 
and has covered over five hundred items [4] for each. In contrast, 4600 
is the largest number of persons, and 218 the largest number of items, 
covered in any of the other nineteen studies listed; and 2484 and 116 
are fairer comparative figures, for the 4600 persons were studied super¬ 
ficially and the 218 items were coordinated with Kinsey’s. 

It is not by any means true, however, that Kinsey’s conclusions are
all based on statistical data. A great many assertions or implications
about religious, ethical, sociological, psychological, and philosophical
matters are scattered through the book—so many that I got a cumulating
impression that the author is at heart a social reformer [5]. Most of
his conclusions, explicit or implicit, about social and moral issues are
based not so much on the statistical data “routinely secured in the
interviews” as on “supplementary data” secured by other techniques.
“These additional data have come from a considerable list of subjects
with whom long-time social contacts have been maintained, in some
cases for as long as seven and eight years. . . . While these
supplementary records have contributed little to the statistical tabulations of
data, they have provided a considerable portion of the detail . . . on
the psychologic and social concomitants of sexual behavior, particularly
in relation to factors which motivate and control the activities” (p.
74) [6]. It would appear, for example, that most of the material dealing
with the attitudes of various social classes toward sexual techniques,
their patterns of sexual behavior, and the social implications of class
variations in sexual behavior (pp. 363-393) must be based on the
supplementary data. In fact, much of the most interesting, and at the
same time most controversial, material in the book appears to be based
on the supplementary data. At least it clearly is not based on the
statistical data [7]. One reason the many conclusions and interpretations
based on the supplementary data have provoked controversy is that
the data themselves are not presented. The passage on pp. 73-75, from
which the quotation above was taken, seems to be the only account
either of the methods of collection or of the data themselves.

The book contains three rather distinct types of material. There is
methodological material, covering procedures for securing subjects,
methods of obtaining information from subjects, the reliability of the
data, and methods of statistical analysis. There is statistical data on
the sexual behavior of certain groups of white American males. And
there is “supplementary” interpretative material of a sociological or
cultural anthropological kind. These three types of material are
distinctly different in pertinence and in scientific quality. Only the second
pertains primarily to the subject matter described in the book’s title,
and only the second is a scientific report, in the sense of an attempt to
set forth systematically not only conclusions but the evidence on which
the conclusions are based and from which other investigators can plan
independent programs for verification or extension. The methodological
material has some of the aspects of scientific reporting, but it
contains in addition many unsubstantiated assertions [8]. The sociological
material consists essentially of insights and opinions about sex in our
society. Its character is somewhere between intelligent, technically
trained, deeply interested, and thoughtful observation, like Barnard's



466 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1949

work on administrative behavior, and systematic but perspicacious
anthropological field work, like Redfield's work on the culture of
Central American villages [9].

The three types of material are neither differentiated nor integrated 
in the volume. The methodological material refers not to the “5300” 
males to whom the statistical results relate, but to all twelve thousand 
individuals so far studied. Conclusions based on the sociological in¬ 
terpretations or the supplementary data are frequently stated along 
with those based on the statistical data, and it is frequently difficult to 
judge what the basis is for a given conclusion. A clearer distinction 
would not only have added scientific stature to the report but would 
have eliminated, or at least focused properly, much of the criticism and 
some of the enthusiasm that has greeted it. 

The book is divided into three parts: (I) History and Method, (II) 
Factors Affecting Sexual Outlet, and (III) Sources of Sexual Outlet. 
While this organization results in some unavoidable duplication be¬ 
tween the second and third parts, it is on the whole worthwhile to be 
able to find all the material on, say, the relation of age to sexual outlet 
brought together in a single chapter of Part II. Similarly, it is an ad¬ 
vantage to have all the material on, say, marital intercourse brought 
together in a single chapter of Part III. 

When I first examined the volume, paying attention mostly to its
fascinating substantive findings and scarcely at all to its methods, I 
was very favorably impressed indeed. When I diverted my attention 
to the general methods I began to note shortcomings; but I felt that
these were technicalities, mere blemishes on the surface of the
monument, which might modify some of the findings in detail but surely
would not affect the broad conclusions. After all, many of Kinsey's 
figures would still be important and interesting even if we had to allow 
for an error factor as large as two or even three. But when I spent some 
time studying the statistical methods in detail, I realized that my con¬ 
fidence in the basic significance of the findings cannot be securely 
buttressed by factual material included in the volume. In fact, it now 
seems to me that the inadequacies in the statistics are such that it is 
impossible to say that the book has much value beyond its role in open¬ 
ing a broad and important field. 

Instead of emphasizing directly the reasons for these misgivings 
about Kinsey's statistical techniques, let me bring them out indirectly 
by describing some of the improvements that I would like to see in 
later volumes. Kinsey plans several more reports, and there is every 
indication in this first one that he welcomes and even seeks criticism
that might help him overcome the numerous imperfections in his work,
which he is the first to recognize; so we have here, for once, an instance
where there is really a good case for “constructive” criticism. While 
it is not feasible to make specific detailed recommendations about sta¬ 
tistical work unless one is in close touch with all the practical circum¬ 
stances surrounding it, it is nevertheless possible to suggest tentatively 
directions in which improvement is possible. 

It will be convenient to group my suggestions under three headings, 
relating to the collection, to the presentation, and to the interpretation
of the data.


COLLECTION 

I will not discuss the problem of determining the sexual behavior of a
given subject (that is, the variables selected, the interviewing
technique, and the list of questions [10]) but will confine myself to
suggestions concerning the selection of individuals.

The question of sample size ought to be thoroughly reconsidered. 
Kinsey emphasizes that “the chief concern of the . . . study is an un¬ 
derstanding of the sexual behavior of each segment of the population, 
and that it is only secondarily concerned with generalizations for the 
population as a whole” (p. 82). He defines his segments according to 12 
criteria (sex, race, marital status, age, age of adolescence, education, 
occupation, occupation of parent, religion, religiousness, urbanization, 
and geographic residence), each having from 2 to 10 categories. This 
makes 384,912,000 segments if my arithmetic is correct or “nearly two 
billion” (p. 81) if Kinsey’s is [11]; but as Kinsey points out most of
these are fortunately non-existent or rare, and actually only 163 (p.
29), an impressive enough figure, are covered by the data in the
book. Over 40 pages is devoted to the problem of how large a sample
is necessary in each segment, and the conclusion is “that a sample of 50 
has proven adequate for establishing incidence data, that samples of 
100 or 200 are fairly adequate for means and medians, and that samples 
of 300 or more are quite adequate for determining means and medi¬ 
ans . . . but. . . smaller samples may still be taken as indicative of 
results that may be obtained from larger samples” (p. 683). Unfortu¬ 
nately, however, the discussion of necessary sample size is unadulter¬ 
ated nonsense [12], and represents a prodigious waste of effort. 
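
The segment count discussed above is simply the product of the number of categories in each criterion. The following is a minimal sketch in Python under stated assumptions: the review gives only the range (2 to 10 categories for each of 12 criteria), so the list `hypothetical_counts` is invented purely for illustration and is not Kinsey's actual breakdown. The sketch confirms only that both quoted totals lie within the bounds that twelve criteria imply.

```python
from math import prod

N_CRITERIA = 12              # from the text: 12 classification criteria
MIN_CATS, MAX_CATS = 2, 10   # from the text: 2 to 10 categories each


def segment_count(category_counts):
    """Number of cells in a full cross-classification."""
    return prod(category_counts)


lower_bound = segment_count([MIN_CATS] * N_CRITERIA)   # 2 ** 12
upper_bound = segment_count([MAX_CATS] * N_CRITERIA)   # 10 ** 12

# Hypothetical counts for illustration only, NOT Kinsey's breakdown.
hypothetical_counts = [2, 2, 3, 10, 5, 4, 9, 9, 3, 3, 3, 9]
hypothetical_total = segment_count(hypothetical_counts)

reviewer_total = 384_912_000    # the reviewer's figure quoted above
kinsey_total = 2_000_000_000    # "nearly two billion", taken as a round ceiling

for label, total in [("reviewer", reviewer_total), ("Kinsey", kinsey_total)]:
    inside = lower_bound <= total <= upper_bound
    print(f"{label}: {total:,} within bounds: {inside}")
```

Either total dwarfs the 163 segments actually covered, which is the point of the comparison.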

A proper determination of sample size for Kinsey’s material will not 
be simple, though it should be much easier now that he has collected 
twelve thousand histories than it would have been earlier. The desirable 
sample size depends on the requisite accuracy of the results, for one
thing. To specify accuracy, as Kinsey does (pp. 83-85, 736), simply by
some arbitrary percentage of the true figure is ambiguous for
proportions and may be impracticable for averages, so even the specification
needs further consideration in later work. It may be that a
double-sampling procedure would be appropriate; perhaps the sequential
estimation procedure recently developed by Charles Stein would be of real
value [13].
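
One possible reading of the ambiguity for proportions, offered here as an assumption rather than as the reviewer's own argument, is that a percentage tolerance on a proportion p implies a quite different percentage tolerance on its complement 1 - p, although the two describe the same finding. A small Python sketch with illustrative numbers (the function name and the values are hypothetical):

```python
def relative_error_of_complement(p, rel_err):
    """Relative error implied for 1 - p when p is known to within
    rel_err, expressed as a fraction of p."""
    abs_err = p * rel_err        # absolute error on p
    return abs_err / (1.0 - p)   # same absolute error, relative to 1 - p


# Illustrative values only: a 5 per cent tolerance on an incidence of 0.90.
implied = relative_error_of_complement(0.90, 0.05)
print(f"5% of p = 0.90 is {implied:.0%} of 1 - p = 0.10")
```

Whether the activity or its absence is tabulated thus changes what "accurate to within 5 per cent" demands of the sample.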

In considering sample size for future work, account must be taken
of the fact that many of Kinsey’s cases represent the same individuals,
and are therefore not independent [10]. Another technical difficulty in
determining sample size is that some of Kinsey’s sampling is cluster
sampling; that is, he selects certain “groups” such as sororities,
hitchhikers, or mental institutions, and interviews all members of the
group, so the individual histories in his sample may not be independent.
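
The cost of non-independence within clusters can be sketched with the textbook design-effect approximation for equal-sized clusters, 1 + (m - 1) * rho. This formula is not taken from the review, and the cluster size and intra-cluster correlation below are hypothetical values chosen only to show the order of magnitude involved.

```python
def design_effect(cluster_size, intra_cluster_corr):
    """Approximate variance inflation for equal-sized clusters."""
    return 1.0 + (cluster_size - 1) * intra_cluster_corr


def effective_sample_size(n, cluster_size, intra_cluster_corr):
    """Number of independent observations worth as much as n clustered ones."""
    return n / design_effect(cluster_size, intra_cluster_corr)


# Hypothetical: 300 histories collected in groups of 30, correlation 0.10.
n_eff = effective_sample_size(300, 30, 0.10)
print(f"300 clustered histories behave like about {n_eff:.0f} independent ones")
```

Under these assumed values a nominal sample of 300 carries the information of fewer than 80 independent histories, so sample-size rules stated in raw counts can be badly optimistic.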

I mention these difficulties not to imply that Kinsey should not treat
each history as many observations or that he should not use cluster 
sampling (though I do feel that both techniques require further analy¬ 
sis [14]) but only to indicate that a proper determination of adequate 
sample sizes is not an entirely simple problem. A further complication is 
that multiple measurements are involved in each observation, but this 
is characteristic of sample-size problems. 

Consider next the composition of the sample. Kinsey states that it is
“valid to extend generalizations” from his samples to the “163 groups
on which data are given” (p. 29). Now a sine qua non for generalizing
from a sample to a population is randomness. But, as Kinsey points out,
in a human survey it is impossible to define a population clearly and
then produce a sample that can be guaranteed random (pp. 92-93). On
the other hand, any batch of data that we do get is a random sample 
from some population; that is to say, indefinitely many repetitions of 
the procedure that produced the sample will produce a population. We 
have two handles to manipulate in generalizing from human samples: 
one is to define our population as closely as we can and then attempt 
to approximate randomness in our sampling procedure; the other is to 
analyze our actual sampling procedure as well as we can and then at¬ 
tempt to describe the actual population to which it relates [15]. 

In my judgment, Kinsey in his future work should devote a substantially
larger part of his resources (which I realize are limited) to
attempting to get a good grasp on one or both of these handles. On the
basis of this first report, I believe that far more can be done. With
respect to actually defining his populations and then sampling them at
random, the following remark is suggestive: “we have a network of
connections that could put us into almost any group with which we wished
to work, anywhere in the country” (p. 39). This network, together with
the wide publicity that the book has received, may reduce the necessity
of spending “days and weeks and even some years ... in acquiring the 
first acquaintances in a community” (p. 39). 

But, whatever success may be attained in this randomization surely
will fall short of perfection. At least it falls short of perfection in such
relatively simple matters as marketing surveys, income studies, and
even presidential polls. So every step in the sampling process should be
paralleled by two steps aimed at studying the sampling process. How
many refusals are encountered? How does the refusal rate vary among 
segments? What are the determinable characteristics of refusers in the 
various segments? It is said that “the restrained histories have, on the 
whole, been the more difficult to get” (p. 103). What is the evidence? 
How much harder? Are there trends in the histories with respect to re¬ 
straint or other characteristics? It is said that many of the subjects 
have cooperated in order “to obtain information about some item af¬ 
fecting their personal lives, their marriages, their families, friends, or 
social relations” (p. 37). Are records kept of these questions, and if so 
are the questions related to patterns of sexual behavior? It is said that 
“the greatly disturbed type of person who goes to psychiatric clinics 
has been relatively rare in our sample” (p. 37). What is the evidence?
What similar information is there about the personality types included
in the sample, and what more can be obtained? Rare relative to what?

I should judge that nearly half the total cost of analysis for a project 
such as this would be in checking on the population-sample relationship; 
but it is worth it, for the validity and usefulness of the research
depends fully as much on this as on the soundness of the actual
measurements.

A final remark with respect to data collection: In future work it is
desirable to plan carefully in advance the methodological checks that
are to be included, as for example retakes, comparison of spouses,
comparison of interviewers, comparison of remote and immediate recall,
comparison of earlier with later results by the same interviewer,
analysis of intra-cluster correlation, etc. If in these comparisons the factors
not directly involved were balanced out, the results would be
substantially more meaningful than are those presented in the book.
Furthermore, it is well to avoid comparisons in which, as in the
investigation of the clusters (pp. 93-102) or the comparison of age
groups (pp. 198, 200), one sample is compared with a second which
includes the first; instead, the first should be compared with the
remainder. Application of the statistical principles of experimental design
would make it possible to carry out the methodological checks both
more effectively and more economically.
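
Why the remainder is the better yardstick can be shown with a little arithmetic: the difference between a subgroup mean and the mean of the whole (which contains the subgroup) is the difference from the remainder shrunk by the factor n_rest / n_whole. The Python sketch below uses made-up numbers solely to exhibit that identity; nothing in it comes from Kinsey's tables.

```python
def mean(values):
    return sum(values) / len(values)


def compare(sub, rest):
    """Return (sub minus whole, sub minus remainder, shrinkage factor)."""
    whole = sub + rest
    diff_vs_whole = mean(sub) - mean(whole)
    diff_vs_rest = mean(sub) - mean(rest)
    shrink = len(rest) / len(whole)
    return diff_vs_whole, diff_vs_rest, shrink


# Hypothetical measurements for one cluster and for everyone else.
sub = [4.0, 5.0, 6.0]
rest = [1.0, 2.0, 3.0, 2.0, 2.0, 2.0]

d_whole, d_rest, shrink = compare(sub, rest)
print(f"vs whole: {d_whole:.2f}, vs remainder: {d_rest:.2f}, factor: {shrink:.2f}")
```

The larger the share of the whole that the subgroup makes up, the more a comparison with the whole understates the real difference.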

PRESENTATION 

My strongest recommendation about future reports is that they 
should tell precisely what was done. My strongest complaint against 
the present volume is that when I study it in any detail I frequently 
cannot tell what information Kinsey’s conclusions are really based on. 

In the first place, I hope that the next volume (or at least a mimeo¬ 
graphed supplement to it) will give an account of the actual questions 
used. There are admittedly difficulties in doing this, for the questions 
are numerous and “have never been standardized” (p. 61) because 
“the form of each question has varied for the various social levels and 
for the various types of persons with whom the study has dealt.” 
For example, “sexual vernaculars must be used in interviewing lower 
level individuals” and “such vernaculars vary considerably among 
different groups” (p. 52). Nevertheless, the point which each question 
covers is said to be “strictly defined” (p. 51) and terms such as “pet¬ 
ting” and “prostitute” are precisely explained to the subjects. The 
interpretation of nearly everything in a study like this depends upon 
the questions, and their absence makes it impossible for other workers 
independently to test or to supplement Kinsey’s data. 

In the second place, future reports should show the actual number 
of histories involved and how they are distributed with respect to all 
the controls—age, education, occupation, religion, etc. [15] As a matter 
of fact, I am not even sure how many white males are covered by the 
present volume [3]. Furthermore, every effort should be made to 
present the basic data, at least in a supplementary monograph. Such 
a presentation should show for each question and each segment the 
responses given. It is not too late to publish such a supplement to the 
present volume; this would require a table of 163 columns and perhaps 
300 rows, and could probably be presented in less than a hundred 
pages. 

Any information given on the basis of all histories so far collected, 
as in methodological tests, should also give separately the correspond¬ 
ing figures for the histories covered in the specific report. Obviously 
statements that are true of twelve thousand cases may not be true at 
all of a specific set of 5300. 

In general, I found the tables apparently meaningful while reading
the book for its substantive content; but when I studied them I found
many of them confusing and quite a few downright unintelligible 
[16]. This is intolerable in a statistical report, but it can be avoided 
easily in future volumes by competent technical editorial work. 

The explanations of statistical techniques struck me as thoroughly 
unclear. The explanation of the Accumulative Incidence Curves, de¬ 
scribed as “the one new statistical tool which we have had to develop 
for this study” (p. 114) and its superiority over the ogive, would have 
left me bewildered had I not come across essentially the same idea 
clearly explained in the June 1946 Journal of the American Statistical
Association in an article on “The Operating Life of B-29 Engines” by
Oscar Altman and Charles Goor, who casually refer to the technique 
as a standard actuarial device. And Kinsey’s formula for the median 
(p. 113) is just wrong [17]. Again, these things are intolerable in a 
statistical report; but they can surely be remedied in future work. 

As for the so-called “U. S. Corrections” (pp. 105-109), I have been 
unable to understand them. The explanation sounds straightforward, 
but a few calculations I have made with them do not check with Kin¬ 
sey’s [18]. The U. S. Corrections are obviously intended as a set of 
weights for combining various of the 163 segments into averages ap¬ 
plicable to broader groups, and of course some such set of weights is 
needed. 


INTERPRETATION 

Most of what I have to suggest under this heading has been covered 
by implication under the “collection” heading. In general, if a statistical
investigation of this kind is well planned and the data properly
collected the interpretation will pretty well take care of itself. So-called
“high-powered,” “refined,” or “elaborate” statistical techniques are 
generally called for when the data are crude and inadequate—exactly 
the opposite, if I may be permitted an obiter dictum, of what crude and 
inadequate statisticians usually think. 

Kinsey’s data have very properly been subjected to a minimum of 
processing. His measurements of incidence, frequencies, and accumu¬ 
lative incidence are straightforward and sound, and no doubt his U. S. 
Corrections are too, if I only understood them. While he smooths most 
of his curves, he always shows the original data too [10]. All of this I 
hope to see continued in later volumes. 

The only measure which is seriously mishandled is the range. This is 
a statistic which can be interpreted only if one takes into account the 
number of observations, makes stringent assumptions about the
normality or other mathematical characteristics of the population and
about the independence of the observations, and has at hand an ap¬ 
propriate set of tables relating sample range to population variability. 
If the variability in several populations is constant, the range in 
samples will be larger the larger the sample. Kinsey falls into a fallacy 
when he shows (p. 234) the next-to-highest frequency of outlet found 
at various ages and interprets the marked decline with age to indicate 
that variability declines with age. It would not be surprising if it does 
in fact decline with age, but the table does not show it without further 
interpretation, for the number of cases declines from 3,012 in the 
youngest group to 68 in the oldest. Sample ranges should not be 
shown in future studies but some other measure of variability should 
be substituted, for example the standard deviation, the average
deviation, or (probably more practicable) the distance between the first
and ninth deciles.
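
The dependence of the range on sample size is easy to demonstrate by simulation. The sketch below is only an illustration under an assumed standard normal population (Kinsey's frequency distributions are certainly not normal): it draws repeated samples of 3,012 and of 68, the two group sizes mentioned above, from the same population, and compares the average sample range with the average distance between the first and ninth deciles.

```python
import random
import statistics


def sample_spread(n, rng):
    """Range and interdecile distance of one sample of size n."""
    xs = sorted(rng.gauss(0.0, 1.0) for _ in range(n))
    sample_range = xs[-1] - xs[0]
    deciles = statistics.quantiles(xs, n=10)   # nine cut points
    interdecile = deciles[-1] - deciles[0]     # ninth minus first decile
    return sample_range, interdecile


def average_spread(n, reps=100, seed=0):
    """Average both measures over reps independent samples."""
    rng = random.Random(seed)
    results = [sample_spread(n, rng) for _ in range(reps)]
    mean_range = statistics.fmean(r for r, _ in results)
    mean_interdecile = statistics.fmean(d for _, d in results)
    return mean_range, mean_interdecile


large = average_spread(3012)   # size of the youngest group
small = average_spread(68)     # size of the oldest group
print(f"n=3012: range {large[0]:.2f}, interdecile {large[1]:.2f}")
print(f"n=68:   range {small[0]:.2f}, interdecile {small[1]:.2f}")
```

With variability held constant by construction, the range still shrinks markedly in the smaller sample while the interdecile distance stays nearly the same, which is why the latter is the safer measure to tabulate.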

It would be desirable in further work to distinguish between
differences that are not statistically significant (in other words, that
aren’t there as far as the data show) and those that are statistically
significant but of little consequence. In several instances the present
book disregards differences which the data establish, but which if ad¬ 
mitted to exist could be shown to be small enough to be negligible for 
many purposes. For example, different interviewers seem to get sig¬ 
nificantly different results (as is characteristic of many measuring de¬ 
vices), but the differences are apparently not great enough to affect the 
conclusions seriously (pp. 133-143). 

Terman’s criticisms of Kinsey’s interpretation of the data, and also
his criticisms of other aspects of the work, all seem to me entirely sound
[19]. Terman maintains convincingly that Kinsey has misinterpreted
his data when, for example, he concludes that “the sexual patterns of
the younger generation are . . . nearly identical with the sexual
patterns of the older generation in regard to . . . many types of sexual
activity” (p. 397), and when he asserts that sexual patterns are stable
throughout life, assuming in childhood the pattern of the occupational 
group to which the individual will ultimately move (p. 419). But it is 
hard to make specific recommendations for avoiding this kind of mis¬ 
interpretation, except to reiterate that a substantially better quality of 
statistical work is essential if future research is to have value. 

After studying the volume I still feel that it is a pioneering and 
monumental work in an important field. And of the authors’ virtues 
as listed by Dr. Gregg, I still admire their tenacity, their tolerance, 
their social skills, and their courage; but I have major reservations
about their analytical competence insofar as that means statistical
competence. The work will take a place in the history of this subject, 
it seems to me, analogous to that occupied in the history of price studies 
by Thorold Rogers' seven volumes on the History of Agriculture and
Prices in England, 1259-1798, of which the Encyclopedia of the Social
Sciences says, “Even his severest critics express admiration of his
scholarly labors in extracting prices from such sources as the bailiff's
accounts of the property held by the colleges of Oxford and Cambridge
and the great monastic corporations of the Middle Ages. It is the
interpretation of these figures . . . which is open to question and
correction.” “From the point of view of modern statistical technique these
studies leave much to be desired. They have been modified and
supplemented by recent works, especially those of Beveridge, Usher, and
Hamilton, so that it is now possible to trace with some assurance the
general trend of prices in western Europe through the interesting period
of the ‘price revolution’” [20]. As a consequence of Kinsey's labors it
will no doubt be possible at some future date to describe with some 
assurance the statistical pattern of sexual behavior. 

NOTES 

[1] The following passage comes as close to defining sexual activity as any I 
have found in the book: “The sexual activity of an individual may involve a
variety of experiences, a portion of which may culminate in the event which 
is known as orgasm or sexual climax. There are six chief sources of sexual 
climax. There is self stimulation (masturbation), nocturnal dreaming to the 
point of climax, heterosexual petting to climax (without intercourse), true 
heterosexual intercourse, homosexual intercourse, and contact with animals 
of other species. There are still other possible sources of orgasm, but they 
are rare and never constitute a significant fraction of the outlet for any large 
segment of the population” (p. 157). “Outlet,” as used in the report, seems
to be equivalent to “orgasms” (pp. 193, 683). The six forms of behavior listed
constitute the sexual behavior studied in the report; two measurements are 
presented for each type: (1) incidence, the proportion of individuals engag¬ 
ing in the activity, and (2) frequency, the number of orgasms per week from 
the activity. 

[2] Kinsey tends to criticize other studies on grounds that sometimes amount to
nothing more than that their methods differ from his. For example, he
asserts that Terman’s data on 1,242 married couples “would have been more
reliable if they had been obtained by direct interviewing” (p. 31), though he
neither gives nor cites evidence bearing on the issue, either in discussing
Terman’s and other questionnaire studies (p. 31) or in discussing his own
technique of direct interviewing (Chap. 2). On p. 11 he does assert that
“during the first year the value of personal interviewing as opposed to the
questionnaire technique was subjected to some testing,” but I have found no
other reference to this testing or its results. Again, on p. 42 it is asserted, in
effect, that “things . . . can be done in a person-to-person, guided
interview . . . that can never be done through a written questionnaire, or even
through a directed interview in which the questions are formalized and the
confines of the investigation strictly limited.” Despite the lack of supporting
evidence, this seems plausible to me; but so would it if its meaning were
reversed by interchanging “person-to-person guided interviews” with
“written questionnaire” or with “directed interview,” and so does the following,
from p. 62: “Whether the techniques which have been used in the present
study would be equally effective with other persons engaged in studying
other problems, is a question which must be answered empirically by each
investigator in connection with his own special problems.”

[3] The number of individuals involved in the study is given on p. 10 as 12,214. 
On p. 5, however, there is an outline map of the United States with the 
legend “Sources of histories. One dot represents 50 cases.” The map contains 
427 dots, so presumably represents 21,350 cases. Even if for each state one 
of the dots represents fewer than fifty cases, 427 dots would represent at 
least 19,000 cases. Kinsey uses “histories” and “cases” with different
meanings, so their apparent equivalence on this map is confusing: “history” refers
to the data for one individual, and “case” to the data for one individual dur¬ 
ing one five-year period of his life. The largest number that I have seen men¬ 
tioned in the book as the total number of cases is 14,084 (p. 220). These are 
presumably derived from the 5,300 white males to whom the data in the 
book are said to relate (pp. vii, 6). The ratio of cases to histories for the white 
males thus appears to be 14,084:5,300 or 2.66. Application of this ratio to 
all 12,214 histories suggests over 32,000 cases, which is as much too high as 
the number of histories is too low to account for the 427 dots on the map. 
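
The arithmetic in this paragraph can be retraced in a few lines of Python. All figures are those quoted in the note; only the variable names are introduced here.

```python
CASES_PER_DOT = 50           # map legend, p. 5
DOTS_ON_MAP = 427            # the reviewer's count of dots

TOTAL_HISTORIES = 12_214     # p. 10
WHITE_MALE_HISTORIES = 5_300
WHITE_MALE_CASES = 14_084    # p. 220

cases_on_map = DOTS_ON_MAP * CASES_PER_DOT                    # 21,350
cases_per_history = WHITE_MALE_CASES / WHITE_MALE_HISTORIES   # about 2.66
projected_cases = cases_per_history * TOTAL_HISTORIES         # over 32,000

print(f"map implies {cases_on_map:,} cases")
print(f"ratio of cases to histories: {cases_per_history:.2f}")
print(f"projected cases for all histories: {projected_cases:,.0f}")
```

The map total falls between the 12,214 histories and the roughly 32,000 cases that the ratio projects, so neither reading of "cases" accounts for it.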

Actually, it is not quite clear that the total number of histories is 12,214,
as stated on p. 10. The histories are said to include 5,300 white males, about
1,000 non-white males (p. 6), and 5,800 females (p. 29). The discrepancy of
114 is presumably due to rounding to hundreds, although the total number
of males is given at least once as “about 6,300” (p. 6) and at least once as
“6,200” (p. 29). However, 12,214 is the total number shown both in the
table distributing histories by year of collection (p. 10) and in the one
distributing them by interviewers (p. 11).
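
A short check of the totals just discussed, again using only the figures quoted in the note (the dictionary keys are labels introduced for this sketch):

```python
stated_components = {
    "white males": 5_300,
    "non-white males": 1_000,
    "females": 5_800,
}
STATED_TOTAL = 12_214

component_sum = sum(stated_components.values())   # 12,100
discrepancy = STATED_TOTAL - component_sum        # 114

males_from_components = (
    stated_components["white males"] + stated_components["non-white males"]
)
print(f"components sum to {component_sum:,}; discrepancy {discrepancy}")
print(f"males implied by components: {males_from_components:,}")
```

The implied male total of 6,300 agrees with the figure on p. 6 but not with the 6,200 given on p. 29.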

That 5,300 is the number of “white males who have provided the data for
the present publication” (p. 6) is not confirmed by any of the statistical
tables in the book. The largest total I have noticed (often the totals are not
shown but have to be computed) in any of the tables that appear to cover
all of the white males is the 4,120 shown distributed by religion in Table 41,
p. 208. This same table shows 4,940 males distributed by occupation, but
since the adjacent column which is said to distribute 179 males by
occupation totals 237, it may be that some individuals are classified under more
than one occupation. Indeed, it may be that the 4,120 include some classified
under more than one religion, for 4,102 is the number in the same table
classified by education, and 4,069 is the number classified by age at onset of
adolescence. Tables 37 and 41 both include what appears to be the same
distribution by age at onset of adolescence, but the frequencies differ: Table
37 includes 521 more males (about one-eighth more) than Table 41, and
these additional 521 have a distinctly different distribution by age of
adolescence than the 4,069, relatively more (37.0 per cent instead of 30.8 per cent)
being below 13 years and relatively fewer (30.7 per cent instead of 36.3 per
cent) above 13. Table 67, p. 298, shows 4,606 males distributed by age at
onset of adolescence through 16 years, which is 43 or 44 more than shown in
Table 37 for the same ages; adding to these 4,606 the 28 shown in Table 37
as reaching adolescence after 16 would give a total of 4,634 males. Table 36,
p. 186 shows only 3,730 cases distributed by school grade at adolescence, and
Table 35, p. 184, shows totals ranging from 1,355 to 3,573 cases for five
distributions by age of occurrence of various developments in adolescence. The
number for whom information is not available on a given item is never
shown. In general, very little is revealed in the statistical data about the
number of males covered in the volume.
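
The comparisons among Tables 37, 41, and 67 in the paragraph above can be reconstructed as follows. The inputs are the figures quoted in the note; the derived Table 37 totals are inferences from those figures, not numbers read from the book.

```python
TABLE_41_BY_ADOLESCENCE = 4_069   # Table 41, by age at onset of adolescence
EXTRA_IN_TABLE_37 = 521           # additional males in Table 37
TABLE_37_AFTER_16 = 28            # Table 37, adolescence after 16
TABLE_67_THROUGH_16 = 4_606       # Table 67, adolescence through 16

# Inferred totals for Table 37.
table_37_total = TABLE_41_BY_ADOLESCENCE + EXTRA_IN_TABLE_37     # 4,590
table_37_through_16 = table_37_total - TABLE_37_AFTER_16         # 4,562

share_extra = EXTRA_IN_TABLE_37 / TABLE_41_BY_ADOLESCENCE        # about 1/8
excess_in_table_67 = TABLE_67_THROUGH_16 - table_37_through_16   # 44
combined_total = TABLE_67_THROUGH_16 + TABLE_37_AFTER_16         # 4,634

print(f"Table 37 total (inferred): {table_37_total:,}")
print(f"extra share: {share_extra:.3f}")
print(f"Table 67 excess: {excess_in_table_67}; combined: {combined_total:,}")
```

On these inferred totals the excess comes to 44, consistent with the "43 or 44" stated above, and even the largest combined figure, 4,634, remains well short of 5,300.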

The following “definition” of cases is said to “have been applied . . .
throughout the present volume” (p. 682): “Cases. Showing the size of the
population on which the data in the tables are based” (p. 683). Table 44,
p. 220, shows 14,084 cases distributed by age from adolescence to 85, of 
whom 11,467 are 30 or under. Table 40, p. 198, shows 14,083 as the number 
of cases for “all ages, adolescence to 85,” also with 11,467 of them 30 or 
under. Table 41, p. 208, however, shows the number of cases from adoles¬ 
cence through 30 as 11,985; so perhaps 518 cases should be added to the 
14,084 in trying to discover the total number of cases involved in the study. 
Tables 104 and 105 (pp. 410 and 412) both show 13,359 as the total number 
of cases, 11,314 single and 2,045 married; 9,286 aged under 33 and 4,073 
aged 33 or over. (Incidentally, this is one of the few pairs of tables I have 
had occasion to compare that have not proved inconsistent in their totals; 
the pair on pages 10 and 11 is another exception, and Tables 40 and 44 miss 
by only one case. Tables 104 and 105, though consistent with one another, 
show fewer cases 32 years of age and under than are shown as 30 and under 
in Tables 40, 41, and 44.) Tables 152-154, pp. 686-734, the so-called “Clini¬ 
cal Tables,” include 15,746 cases (11,725 single, 3,275 married and 746 pre¬ 
viously married), according to my addition. The numbers of cases shown in 
these clinical tables are hard to reconcile with one another, however, for the 
sum of the numbers shown in various subdivisions sometimes exceeds, and 
sometimes is exceeded by, the number shown for the whole group. For
example, the data on pp. 688-690 for single white males, age group adolescence
through 15 years, educational level 13+, show the whole group as 2,799
cases; but the sum of corresponding figures for the urban (2,587) and the 
rural (352) subdivisions of the group is 2,939, and the sum of the correspond¬ 
ing figures for the six religious subdivisions of the same group is 2,974. Such 
a result, where the whole is less than the sum of its parts, could occur if, for 
example, an individual who shifted from one subdivision to another were 
classified in both for the five-year interval in which the shift occurred— 
though this practice would be unsound. On the other hand, the whole fre¬ 
quently turns out to exceed the sum of its parts; for example, for the same 
age, color, and marital status just discussed but educational level 8-12, the 
total number of cases for the whole group is 606, but the urban (459) and 
rural (124) figures total only 583. 
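
The whole-versus-parts comparisons in this paragraph lend themselves to a mechanical check. The sketch below tabulates the figures quoted above and reports, for each group, the sum of the parts minus the stated whole; the labels are shorthand introduced here.

```python
# (label, stated whole, stated parts), all figures as quoted in the note.
CHECKS = [
    ("Tables 104/105, single + married", 13_359, [11_314, 2_045]),
    ("Tables 104/105, under 33 + 33 or over", 13_359, [9_286, 4_073]),
    ("Clinical tables, by marital status", 15_746, [11_725, 3_275, 746]),
    ("Educ. 13+, urban + rural", 2_799, [2_587, 352]),
    ("Educ. 8-12, urban + rural", 606, [459, 124]),
]


def gap(whole, parts):
    """Sum of parts minus the whole; zero means the table is consistent."""
    return sum(parts) - whole


gaps = {label: gap(whole, parts) for label, whole, parts in CHECKS}

# Cases through age 30: Table 41 versus Tables 40 and 44.
extra_cases_table_41 = 11_985 - 11_467   # 518

for label, value in gaps.items():
    print(f"{label}: parts minus whole = {value:+d}")
print(f"Table 41 excess through age 30: {extra_cases_table_41}")
```

A routine of this kind, run over every table before publication, would have caught both the surpluses and the shortfalls described here.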

The term “population” in the definition of cases quoted above seems ac¬ 
tually to mean “sample. ” The rest of the paragraph of which the quotation is 



476 


AMEMCAN statistical association JOUENAL, DECBMBEE 1949 

the first sentence deals with the adequacy of samples of various sizes. In 
general, Kinsey seems to be aware of the distinction between the statistical 
concepts of “sample” and “population” and of the fundamental importance 
of the distinction (see, for example, the section on “The Taxonomic Ap¬ 
proach,” pp. 16-21), but he misuses the terms in a way that is frequently 
disconcerting. For example, the heading “population in sample” appears 
in various tables (e.g., p. 1^) and “sample population” in others (p. 188).
Sometimes the phrases “sample population” and “U. S. population” (p. 188)
seem to represent the sample-population distinction. The phrase “the whole
population involved in the present study” (p. 194) seems to mean “the whole 
sample.” 

“Educational level 13+” is referred to above in discussing the numbers 
of cases shown in Tables 152-154. To Kinsey “13+” means “ultimately
more than 12 years,” i.e., that at least a start has been or ultimately will be
made to college. In using the tables as a basis for comparison of any particu¬ 
lar male, it is necessary either to exclude “persons who are still in school, 
since there is no certainty how far they will go before they finally terminate 
their education” (p. 331), or to “predict, on the basis of his home background,
the amount of his future schooling” (p. 682); and presumably similar
requirements apply to those who have left school before reaching the
highest educational level and who might later resume their education. 

It is doubtful that the age division referred to above in connection with 
Tables 104 and 105, and also involved in Tables 98-103, is between “persons 
who were 33 years of age or older at the time they contributed their histories” 
and “persons who were younger than 33 at the time of contributing,” as 
stated on p. 395, since Tables 98-100 include data on the behavior of the 
younger group at age 33. Perhaps the younger group was actually 33 and 
under and the older group over 33. 

[4] Five hundred and twenty-one is said to be the number of “items which are 
systematically covered on each of the histories in the present study” (p. 32),
but this is perhaps an exaggeration, partly because the count depends on 
what is considered an item and what is considered merely one of several 
possible responses for an item, and partly because for any individual many 
of the items are inapplicable. 

The list of “Items Covered on Sex Histories” (pp. 63-70) shows nine ma¬ 
jor groupings under which are a total of 71 numbered headings. The items 
listed under some headings call for distinct pieces of information; for example,
under “educational history” are listed years of schooling, colleges
attended, college majors, age upon leaving school, and age while in high 
school. The items listed under other headings, on the other hand, constitute 
little more than a list of possible answers to a single question; for example, 
under “recreational interests” are listed, among others, moving pictures, 
dancing, cards, hunting, fishing, reading, sewing, music, sports. There is no 
way to tell whether the claim of 521 includes items like those under recrea¬ 
tion on a par with those under education. 

The “actual number of items covered in each case” is described on p. 63 
as “usually nearer 300, and the number involved in the histories of younger 
and less experienced individuals is often less than that.” This suggests that 
300 or less is the number of items involved for any one person. The following
passage from p. 50 suggests, however, that 300 or more is the number for each 
individual: “On each history in the present study there has been a sys¬ 
tematic coverage of a basic minimum of about 300 items. This minimum is 
expanded for persons who have extended experience. . . . The maximum his¬ 
tory covers 521 items.” On the other hand, it appears from p. 51 that numer¬ 
ous items beyond the list of 521 may be covered in some histories, e.g., males 
with elaborate techniques of masturbation, individuals who have had some 
complex relation with their parents, identical twins, highly intelligent in¬ 
dividuals with considerable experience in a socially taboo type of behavior, 
individuals involved in masochism or sadism, persons who are handicapped,
have lived in foreign countries, have had experience in military groups, etc. 
“As scientific explorers, we, in the present study, have been unlimited in our 
search to find out what people do sexually” (p. 51). 

Kinsey’s comparison of his study with nineteen others in regard to number
of items covered is a little misleading (pp. 28-29). For example, Ramsey’s
218 items are described as 41.9 per cent as many as Kinsey’s; but for
boys from Junior High Schools, Y.M.C.A.’s, and Boys’ Clubs, the group 
covered by Ramsey, Kinsey’s list of relevant questions may be not much 
larger than 218—certainly it is not nearly two and one half times as large, 
as is implied by the 41.9 per cent figure. Thus what at first appears to be a 
difference in the amount of information for each individual is at least partly 
only a reflection of the difference in the number and variety of individuals 
covered. In fact, we learn on p. 30 that Ramsey’s work was “based on per¬ 
sonal interviews which were coordinated with the list of questions and the 
techniques of the present study.” 

[5] If we accept the implications of Kinsey’s assertions (p. 199) that “there 
is, inevitably, some correlation between these rates [orgasms per week] and 
the positions which these persons take in a public debate” on sex instruction 
and administrative policies of educational institutions, that “the policies 
that ultimately come out of such meetings [on juvenile delinquency, law en¬ 
forcement, and sex laws] would reflect the attitudes and sexual experience 
of the most vocal members of the group, rather than an intelligently thought- 
out program established on objectively accumulated data,” and that “often 
the conclusions [of scientific discussions of sex] are limited by the personal 
experience of the author,” then many of his own conclusions and implications,
especially those not based on the statistical data, require for their
evaluation some indication of where he himself falls in his various distribu¬ 
tions. Acceptance of this line of argument might “explain away” many of 
Kinsey’s social and ethical judgments, as, for example, if he were one of the 
six persons described on p. 217 as having the highest long-time averages. 

[6] The first omission from this quotation describes in more detail the social 
contacts used to collect the supplementary data: visits to the subjects’ 
homes; visits with them to their friends’ homes, theatres, concerts, night 
clubs, taverns, and other places of recreation; correspondence; records of 
their sexual activities; photographs of their drawings; transcripts of their
court, institutional, or social agency records. This impressive account of 
Kinsey’s own collection of supplementary data should be contrasted with the 
following: “Sometimes social scientists hobnob as tourists in some social 
milieu sufficiently removed from their own to make it possible for them to
acquire ‘impressions’ and ‘hunches’ about ‘social patterns’ and ‘motivations
of behavior’ in whole cultures. . . . The day seems overdue when scientists
studying human material will forsake barbershop techniques . . .” (p. 19).
This brings to mind Alfred Marshall’s “general rule that in discussions on 
method and scope, a man is nearly sure to be right when affirming the usefulness
of his own procedure, and wrong when denying that of others” (Principles
of Economics, Eighth Edition, p. 771). Another example is provided
by comparing the passage on p. 201 objecting to conclusions about sexual 
behavior based on persons who come to a clinic, with the passage on p. 37 
asserting that while many of Kinsey’s subjects were obtained because of 
public knowledge of the project as a source of help with personal sexual 
problems, these were the everyday sexual problems of the average individ¬ 
ual. 

[7] Actually, it is hard to see how even the supplementary data could support 
such apparently quantitative assertions as “most of the tragedies that de¬ 
velop out of sexual activities are products of this conflict between the atti¬ 
tudes of different social levels” (p. 385), or “most of the complications which 
are observable in sexual histories are the result of society’s reactions when it 
obtains knowledge of an individual’s behavior, or the individual’s fear of
how society would react if he were discovered” (p. 202). 

[8] Many of Kinsey’s methodological assertions are of considerable interest, so
it is regrettable that he has not indicated their basis. Examples are: “Appreciation
must be sincere, else it will not work” (p. 37). “The underworld requires
only a gesture of honest friendship before it is ready to admit one as a
friend, and to give histories ‘because you are my friend’ ” (p. 36). “The experienced
interviewer knows when he has established a sufficient rapport
to obtain an honest record, in the same way that the subject knows that he 
can give that honest record to the interviewer” (p. 43). “People understand 
each other when they look directly at each other” (p. 48). “We attempted to 
follow standard practice [of making records only after the subject has left at 
the close of an interview] early in this study and found that it introduced a 
tremendous error into the records” (p. 50). “When one is dealing with such 
a socially involved question as sex it becomes particularly important to ask 
direct questions. . . . Euphemisms should not be used ...” (p. 53). “ . . . 
we always begin by asking when they first engaged in such activity. ... It 
might be thought that this approach would bias the answer, but there is no 
indication that we get false admissions . . .” (p. 53). “Looking an individual
squarely in the eye, and firing questions at him with maximum speed, are 
two of the best guarantees against exaggeration” (p. 54). Also, see Notes 2 
and 15. 

The correctness of these assertions is not in question here. They are cited 
simply as illustrations of propositions asserted without the evidence neces¬ 
sary to enable another investigator to evaluate them. In this sense they dif¬ 
fer from Kinsey’s conclusions about male sexual behavior, insofar as these 
are accompanied by statistical data whose source is explained. 

[9] See Chester I. Barnard, The Functions of the Executive (Harvard University
Press, 1938, pp. xvi+334) and Organization and Management (Harvard University
Press, 1948, pp. xi+244); Robert Redfield, Tepoztlan, A Mexican
Village: A Study of Folk Life (University of Chicago Press, 1930, pp. xi+247)
and The Folk Culture of Yucatan (University of Chicago Press, 1941,
pp. xxiii+416).

In quality, as contrasted with type, Kinsey’s work is not comparable with 
Barnard’s or Redfield’s. Barnard, for example, seems to have a deeper understanding
of the inconsistency between legalistic statements of the principles
of administrative authority in a social organization and actual
behavior in the same organization, than Kinsey does of the inconsistency 
between our “publicly pretended code of morals” (p. 197) or our “socially 
pretended custom” (p. 203) and his finding that the persons involved in 
“illicit activities, each performance of which is punishable as a crime under 
the law, . . . constitute more than 95 per cent of the total male population”
(p. 392). 

Incidentally, this 95 per cent figure needs to be interpreted with more 
care than Kinsey uses. It means that 95 per cent of all white males either 
have engaged at least once in their lives in an “illicit” activity or can be ex¬ 
pected to engage in one at least once, if they live to be 85. (Actually the same 
figure results if we assume only that they will live to be 45.) Several of the
statements Kinsey makes in connection with this 95 per cent figure imply 
that 95 per cent of the total male population has engaged in illicit activities. 
This is logically equivalent to interpreting “all males now living will ulti¬ 
mately die” to mean “all males now living have already died.” 

No source is cited for the figure 95 per cent and it cannot be verified from 
the statistical data given in the book. Even if it is correct and were correctly 
interpreted, to conclude from it that “only a relatively small proportion of 
the males who are sent to penal institutions for sex offenses have been in¬ 
volved in behavior which is materially different from the behavior of most 
of the males in the population” (p. 392) would be a non-sequitur. As Ter- 
man says, “it is as though one said that if 95 per cent of all males have at 
some time in their lives stolen something, those who are sent to penal institutions
for theft or burglary are not materially different from most males in
the population.” (Lewis M. Terman, “Kinsey’s ‘Sexual Behavior in the 
Human Male’: Some Comments and Criticisms,” Psychological Bulletin, vol.
45, 1948, pp. 443-459; the quotation is from p. 456.) 

Kinsey makes a similar statement about adolescents: “On a specific cal¬ 
culation of our data, it may be stated that at least 85 per cent of the younger 
male population could be convicted as sex offenders if law enforcement of¬ 
ficials were as efficient as most people expect them to be. The stray boy who 
is caught and brought before a court may not be different from most of his 
fellows, but the public, not knowing of the near universality of adolescent 
sexual activity, heaps the penalty for the whole group upon the shoulders of 
the one boy who happens to be apprehended” (p. 221). 

[10] Terman, in the review cited in Note 9, makes a number of sound criticisms 
of the methods of obtaining data from a given subject. One that must receive 
attention in any consideration of Kinsey’s statistics is his use of data based 
on long-distance memory. Each individual is asked about his activities as 
far back as he can recall, and his reports are tabulated by five year age inter¬ 
vals. Thus each individual provides several cases. “In the computation of 
mean frequency of masturbation at age 15, for example, the memory report 
of a 50-year-old counts as heavily as the report of a 15-year-old” (Terman,
p. 446). One important consequence of this, aside from inaccuracies that it
may introduce, is that many of the “cases” on which the statistical analysis
is based are not independent. This might explain, for example, at least in
part, Kinsey’s finding that patterns of sexual behavior are remarkably constant
throughout life, assuming in early childhood the pattern of the occupational
and educational level to which the individual ultimately will belong
(p. 419). It also invalidates, at least partially, Kinsey’s notion that
“smooth trends in such curves are evidence of their approach to reality”
(p. 132), for most of the smooth trends shown are age trends and therefore 
involve many of the same individuals in the successive age groups. In fact 
most of the charts to which Kinsey has added smoothed curves are accumu¬ 
lative incidence curves; not only are the successive age groups not inde¬ 
pendent here, but the calculations for any age group involve cumulating 
cases over other age groups. 

Incidentally, the fact that some 62 of the 173 charts contain smoothed 
curves is surprising in view of the statement that “All of the frequency
curves in this volume are based on the actual calculations, and in no instance
have they been smoothed by any process or approximated by interpolations
or other sorts of estimates or predictions” (p. 111). Most of the
62 smoothed curves are accumulative incidence curves, which show per cent
of total population having had a specified type of experience (a sort of
cumulative frequency adjusted for exposure-to-risk) plotted against age, so
perhaps are not regarded as “frequency” curves. Figures 36 and 37, however,
both show frequency data that have been smoothed by cumulating and
further by use of a curve. Personally, however, I have no complaint against 
the smoothed curves, since all charts show the actual observations clearly. 
The simple frequency polygons, showing per cent of cases in various class 
intervals by average number of orgasms per week, are nearly all too erratic 
to permit of useful smooth curves; they thus fail to provide whatever con¬ 
fidence in the data would be lent by smoothness. 

[11] The criteria by which cases are classified and their numbers of categories are: 
sex, 2; race-cultural group, 11; marital status, 3; age, 18; age at adolescence, 
6; educational level, 9; occupational class, 10; occupational class of parent, 
10; rural-urban background, 5; religious group, 3; religious adherence, 4; 
geographic origin, unspecified. The product of the 11 category numbers is 
384,912,000. If Kinsey intends to represent geographic origin by 48 states,
which are the only geographic units he mentions, there will be nearly 20
billion categories. So far, he has made no classifications by geographic origin,
but he says “state of residence for the most continuous period of time,
and the place of residence during the childhood and adolescent years, will
probably represent the most significant part of the data” (p. 81). It seems to
me that a few broad regions might suffice; five regions would result in “nearly
two billion” segments.
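
The counts in this note can be reproduced directly. The sketch below, in Python, multiplies the eleven category numbers listed above and then applies the two geographic alternatives discussed (48 states, or five broad regions); the variable names are illustrative assumptions.

```python
from math import prod

# Number of categories for each classification criterion, as listed above.
category_counts = {
    "sex": 2,
    "race-cultural group": 11,
    "marital status": 3,
    "age": 18,
    "age at adolescence": 6,
    "educational level": 9,
    "occupational class": 10,
    "occupational class of parent": 10,
    "rural-urban background": 5,
    "religious group": 3,
    "religious adherence": 4,
}

# Every combination of categories is a potential segment of the population.
base_segments = prod(category_counts.values())

# Geographic origin is unspecified in the book; these two multipliers are
# the alternatives considered in the note, not Kinsey's own figures.
segments_with_states = base_segments * 48
segments_with_regions = base_segments * 5

print(f"{base_segments:,}")
print(f"{segments_with_states:,}")
print(f"{segments_with_regions:,}")
```

With 48 states the count is about 18.5 billion, which is the "nearly 20 billion" of the note; with five regions it is about 1.9 billion.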

[12] Quinn McNemar cites the following four fallacies in Kinsey’s efforts to determine
a proper size of sample: “(1) Failure to recognize the fact that the
sampling stabilities of means, medians, and modes are not a function of their
magnitudes, but rather of trait variability. . . . (2) Failure to consider the
fact that these statistics differ markedly from each other in their sampling
errors. . . . (3) Failure to observe the fact that the sampling stability of
percentages is not a linear function of their magnitudes but rather of their 
degree of remoteness from 50 per cent. ... (4) Failure to note that converg¬ 
ence of sub-sample values to total group values must be more rapid when 
sub-samples are drawn from small (finite) groups than when drawn from 
larger groups. ... In brief, incognizance of four elementary statistical prin¬ 
ciples renders worthless this elaborate effort to determine how large N should 
be for a sub-group” (quoted on pp. 450-451 of the Terman review cited in 
Note 9; the positive part of McNemar’s third point seems to be an imprecise 
allusion to the fact that the sampling variance of a proportion is a linear 
function of the squared difference between the proportion and one-half).
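
The fact alluded to in the parenthesis can be illustrated numerically. The sketch below assumes simple random sampling of n independent cases, so that the sampling variance of an observed proportion p is p(1 - p)/n; that assumption, and the function names, are mine and are not part of McNemar's statement. Since p(1 - p) = 1/4 - (p - 1/2)^2, the variance falls off linearly in the squared distance of p from one-half.

```python
def proportion_variance(p, n):
    """Sampling variance of a proportion under simple random sampling."""
    return p * (1.0 - p) / n


def variance_from_remoteness(p, n):
    """Same quantity written as a linear function of (p - 1/2) squared."""
    return (0.25 - (p - 0.5) ** 2) / n


n = 100  # illustrative sample size, chosen arbitrarily
for p in (0.05, 0.25, 0.50, 0.75, 0.95):
    direct = proportion_variance(p, n)
    rewritten = variance_from_remoteness(p, n)
    # The two columns agree; the variance is largest at p = 0.5.
    print(f"p={p:.2f}  direct={direct:.6f}  rewritten={rewritten:.6f}")
```

The two expressions agree for every p, and the variance is greatest at 50 per cent and symmetric about it, which is the sense in which stability depends on "remoteness from 50 per cent."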

Such incompetence on this and other statistical points is surprising in 
view of Kinsey’s statement, widely cited among statisticians, that “the statistical
set-up of the research was originally checked by Dr. Lowell Reed of
the School of Hygiene and Public Health at The Johns Hopkins University.
A long list of persons experienced in sampling and in other aspects of statistics
has been constantly available for consultation” (p. viii). At the Cleveland
meeting where this paper was presented, Helen M. Walker, who presided, 
read a letter from Reed to Kinsey dated December 10, 1948 which said, in 
part, “I have been troubled at the flood of criticism that has been leveled
at your work by the statisticians, mainly because I know that there is value 
in your work, but secondarily because you included my name in the preface 
with the implication that I had some responsibility for the analysis. As you 
of course remember, I saw your work only on the occasion of a two-day visit 
to Bloomington in December, 1942. On the basis of that visit, I joined heart¬ 
ily with the Committee in recommending to the National Research Council 
that support be continued, but a part of the recommendation was that appro¬ 
priate arrangements within the budget should be made to strengthen 
the project from the statistical side. I became so busy with work connected
with the war that I lost all contact with the project and I don’t know what
was done, if anything, to carry out this recommendation. If the type of sta¬ 
tistical guidance had been provided that I had in mind, I feel sure that you 
would now be free of some of the criticism that is now being justly leveled 
at the work. . . . As you know, . . . I have seen nothing of the work between
that two-day visit in 1942 and the public appearance of the book.” (Transcribed
from recording made at the meeting by Chester I. Bliss.) As for the
“list of persons experienced in sampling and in other aspects of statistics,”
it would appear that either they were not consulted or else their advice was 
not followed. 

[13] Stein’s sequential estimation procedure was presented before the Institute 
of Mathematical Statistics at Madison on 10 September 1948, but has not 
otherwise been published. It is a method of determining, to prescribed ac¬ 
curacy with prescribed confidence, the mean of a normal distribution whose 
standard deviation is unknown. On the average, it requires hardly any more 
observations than would be required in standard single sampling if the stand¬ 
ard deviation were known. 

[14] It would be interesting to have the data for the cases in a given age group 
classified by the age of the subject at the time of the interview. Any sys¬ 
tematic variation with age at interview could, however, be interpreted as 
reflecting either memory effects or time trends; but independence between
behavior reported for a given age and age at interview would be necessary 
to confirm both Kinsey’s handling of cases and his conclusion that genera- 
tion-to-generation changes are minor (pp. 394-417). Similarly, a study of 
the intra-class correlation in clusters of the kind Kinsey uses would be 
valuable. 

[15] Kinsey’s discussion of his sample is inadequate in two crucial respects: he 
does not explain adequately his procedure of selecting cases, and he does 
not describe adequately the determinable characteristics of the sample that 
he got. 

With respect to sampling procedures, he says little except that randomness
is difficult to achieve. He does not suggest that he made any efforts to
approximate it; instead, he says “since it is impossible to secure a strictly
randomized sample, the best substitute is to secure one hundred per cent 
of the persons in each social unit from which the sample is drawn” (p. 93). 
No basis for the assertion is given, but probably the hope is to include not 
only the most willing but also the more reluctant subjects. Whatever gain 
there might be in this respect, however, would be offset at least partially by 
the fact that efforts to get all of a group were not ordinarily made unless 
at least half had already been obtained (p. 95); thus, those groups containing 
relatively many reluctant subjects would be less likely to be the objects of 
hundred per cent drives than those containing relatively few. 

As to the composition of the sample actually secured, it is hard to learn 
much about this; even its size is uncertain (see Note 3). Scattered through 
the book are various scraps of information about special groups that have 
been sampled. Terman, on p. 447 of the review cited in Note 9, lists some of 
these: Of the 62 hundred-per cent groups, 42 were of college level and 7 were 
delinquents or inmates of penal and mental institutions (p. 95). “Perhaps
half” of the histories were obtained through contacts resulting from lectures 
(p. 38). Seventeen penal or correction institutions have provided histories 
(p. 15). Five underworld communities and five homosexual communities are
represented; they are listed as “social or civic organizations” (p. 16). There
are data on 1,200 persons convicted of sex offenses (p. 392). In addition, a 
passage on p. 38 strongly suggests to me that “several hundred psychoana¬ 
lysts, psychiatrists, physicians, clinical psychologists, social workers, and 
other professional persons [who] have had an especial interest in observing 
the interviewing techniques” were included; and this interpretation is reinforced
by the section on “The Confidence of the Record” (pp. 44-47). On pp.
14-15 we learn that the sample includes persons who have been students at
528 colleges, and that 14 of these have contributed 100 or more histories 
apiece; even assuming that each of the 14 has contributed only 100 and that 
each of the other 514 has contributed only one, this accounts for 1,914 his¬ 
tories, or nearly 16 per cent of the total number (12,214). The map on p. 5 
showing the source of the data (see Note 3) has 191, or 45 per cent, of its dots 
in Indiana and the four adjoining states, Illinois, Kentucky, Michigan, and
Ohio, five states which contain about 20 per cent of the U. S. population
(19.4 per cent in 1946). 
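
The lower bound on college histories given above follows from a short calculation, sketched here in Python. The figures are those quoted in the text; the variable names are illustrative assumptions.

```python
colleges_total = 528        # colleges represented in the sample (pp. 14-15)
large_contributors = 14     # colleges contributing 100 or more histories
total_histories = 12214     # total number of histories used in the text

# Most conservative case: each large contributor gave exactly 100
# histories and every other college gave exactly one.
minimum_college_histories = (
    large_contributors * 100 + (colleges_total - large_contributors) * 1
)
minimum_share = minimum_college_histories / total_histories

print(minimum_college_histories)
print(f"{100 * minimum_share:.1f} per cent")
```

Because both assumptions are minima, the true share of histories traceable to these colleges can only be larger than this figure.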

Table 41, p. 208, which was discussed in Note 3 above, comes as close as
any to revealing the characteristics of the sample. It suggests that about 
60 per cent of the histories are college level (in fact, that 26 per cent have
training beyond college); that 1.6 per cent are from the underworld, 0.5 per 
cent are business executives, and 61 per cent are white collar or professional 
workers; that 76 per cent are Protestants, 12 per cent Catholics, and 12 per 
cent Jews; that 74 per cent are religiously inactive. What we clearly should 
have, however, is a definite statement of the distribution of the histories 
among the 163 segments for which conclusions are drawn. 

[16] An example of a table that is difficult to understand is Table 14, “Comparisons
of data obtained from spouses” (p. 126). The second column in this
table is headed “items involved” and the third column is headed “unit of
measurement.” The seventh and eighth columns are headed “mean of husband’s
reports” and “mean of wife’s reports.” For the item “pre-marital acquaint.”
the unit of measurement is “12 mon.” and the means of husband’s
and wife’s reports are 42.11 and 40.88. For “engagement” the unit is “4
mon.” and the means are 12.64 and 12.85. For “lapse, marr.–first birth” the
unit is “6 mon.” and the means are 28.05 and 28.19. These figures seem to say
that the couples in the comparison were, on the average, acquainted 41 to 42
years before marriage, engaged over 4 years, and married 14 years before
their first child was born! The data seem more plausible, however, if we assume
that the units of measurement are one month in all three cases. The
“units” given seem only to describe the amount of discrepancy which, if not
exceeded, is called zero, i.e., identical response for both spouses. The fourth
column is headed “ident. rspns.%.”
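
The two readings of Table 14 can be compared side by side. The sketch below, in Python, converts the husbands' means to years under each reading: first taking the stated "unit of measurement" literally, then assuming (as suggested above) that the means are simply in months. The row layout and function names are illustrative assumptions.

```python
# (item, stated unit in months, mean of husband's reports), from Table 14.
rows = [
    ("pre-marital acquaint.", 12, 42.11),
    ("engagement", 4, 12.64),
    ("lapse, marr. to first birth", 6, 28.05),
]


def years_if_unit_is_literal(unit_months, mean_value):
    """Reading 1: the mean counts multiples of the stated unit."""
    return mean_value * unit_months / 12.0


def years_if_unit_is_one_month(mean_value):
    """Reading 2: the mean is in months, whatever the stated unit."""
    return mean_value / 12.0


for item, unit, mean_value in rows:
    literal = years_if_unit_is_literal(unit, mean_value)
    monthly = years_if_unit_is_one_month(mean_value)
    print(f"{item}: {literal:.1f} years (literal) vs {monthly:.1f} years (months)")
```

Under the literal reading the three durations are roughly 42, 4, and 14 years; under the one-month reading they are roughly 3.5, 1, and 2.3 years, which is the more plausible set.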

Tables 152-154 are puzzling because, as mentioned in Note 3, the numbers
of cases in subgroups add up sometimes to more and sometimes to less
than the number of cases shown for the whole group. 

[17] Kinsey says that the formula for the median is (n+1)/2 (p. 113). The text
following this formula, however, gives a correct explanation of the median.
With respect to the arithmetic mean, his formula (p. 112) is correct, except
that he does not define its symbols, but it is followed by three erroneous assertions:
first, that “a mean represents the total number of measurements
. . . in each group divided by the number of individuals in the group” (he
means, of course, not the total number but the sum of the measurements, and
an illustrative example included parenthetically at the omitted part of the
quotation is correctly handled); second, that “the mean represents the midpoint
of the measurements” (the median is the midpoint in the sense
that as many observations lie above as below it, and the mean in the sense
that the sum of the deviations is as great above it as below it, but neither
the mean nor the median is midway between the extremes); third, that the
mean’s “position . . . is therefore [i.e., because it “represents the midpoint
of the measurements”] materially affected by the presence of even a few
high-rating individuals in a population [i.e., sample; see Note 3]; and . . . a
few high-rating individuals affect the means more than a large population
[i.e., number] of low-rating individuals,” which suggests that Kinsey really
believes the second of his erroneous assertions.

Kinsey also asserts that “where most of the individuals in a sample belong 
in a frequency class which is midway between the extremes of the distribu¬ 
tion, and where an equal number of individuals lie in symmetrical distribu¬ 
tion on either side of the midpoint, the mean becomes identical with the 
median” (p. 113). It is not clear whether this means that either of the two
conditions will bring about the identity, or that both are necessary to bring
it about. Actually, the first condition is irrelevant; and the second condition 
is sufficient but not necessary for the identity. 
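
The points made in this note about the mean and the median can be checked on small made-up samples. The data below are invented purely for illustration (they are not Kinsey's), and the sketch uses Python's standard `statistics` module.

```python
from statistics import mean, median

# Hypothetical samples, invented for illustration only.
skewed = [1, 1, 2, 2, 3, 40]              # a few high values pull the mean up
symmetric_hollow = [1, 1, 1, 5, 9, 9, 9]  # symmetric, few cases in the middle
asymmetric_equal = [1, 2, 6, 7, 14]       # not symmetric, yet mean == median
middle_heavy = [0, 5, 5, 5, 5, 9, 10]     # most cases midway, not symmetric


def midrange(values):
    """Point midway between the extremes."""
    return (min(values) + max(values)) / 2


# Neither the mean nor the median need lie midway between the extremes.
print(mean(skewed), median(skewed), midrange(skewed))

# Symmetry alone is sufficient for mean == median ...
print(mean(symmetric_hollow), median(symmetric_hollow))

# ... but it is not necessary ...
print(mean(asymmetric_equal), median(asymmetric_equal))

# ... and concentration in the middle class alone is not sufficient.
print(mean(middle_heavy), median(middle_heavy))
```

The second sample satisfies only the symmetry condition, the third satisfies neither condition, and the fourth satisfies only the concentration condition, so together they show that the first condition is irrelevant and the second sufficient but not necessary.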

[18] As a simple illustration of the use of the “U. S. Corrections,” consider Table 
29, “Continuity of pre-adolescent sex play with adolescent activity” (p. 174). 
The first section of this table shows “% with continuity” for three educa¬ 
tional levels, as follows: level 0-8, 77.0; level 9-12, 67.4; level 13+, 29.8.
Since the sample sizes (“cases”) for these levels are 243, 221, and 763, the 
per cent for all three levels pooled into a single sample of 1,227 would be 
45.9. Obviously, however, an over-all average should weight the three levels 
not in proportion to their sample sizes as pooling does, but in proportion 
to their population sizes. These population weights are given for 1940 as 
53.21, 35.13, and 10.39, respectively, with 1.2 per cent of the population not 
reporting educational level (Table 11, p. 108). Combining the three ob¬ 
served percentages with these weights gives an over-all percentage of 68.6. 
Kinsey shows 64.9, however. Similarly, for the second and third sections of 
the table my calculations give 60.7 and 44.8, but Kinsey shows 54.7 and 
42.1. In a similar check of the three largest percentages shown in the final 
column of Table 38 (p. 190), I find 68.24, 14.42, and 11.29 where Kinsey
shows 68.39, 12.53, and 13.11. Since Kinsey assures us that “all mathemati¬ 
cal calculations on this project have been performed twice, independently 
by each of two persons” (p. 109), I assume that I have not understood the 
“U. S. Corrections” correctly. 
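
The calculation described for the first section of Table 29 can be sketched in Python as follows. The figures are those quoted above. The helper `weighted_mean` and the decision to divide by the sum of the three weights (98.73, since the non-reporting group is left out) are my reading of the procedure, not a documented description of how the “U. S. Corrections” were actually applied.

```python
# First section of Table 29, educational levels 0-8, 9-12, and 13+.
pct_continuity = [77.0, 67.4, 29.8]        # "% with continuity"
sample_cases = [243, 221, 763]             # sample sizes ("cases")
population_weights = [53.21, 35.13, 10.39] # 1940 weights, Table 11


def weighted_mean(values, weights):
    """Average of values, each weighted by the corresponding weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Pooling weights each level by its sample size.
pooled = weighted_mean(pct_continuity, sample_cases)

# The corrected figure weights each level by its share of the population.
corrected = weighted_mean(pct_continuity, population_weights)

print(f"pooled:    {pooled:.1f}")
print(f"corrected: {corrected:.1f}")
```

The pooled figure is dominated by the college-level group, which makes up well over half of the sample but only about a tenth of the population; reweighting is what moves the result from about 46 to about 69 per cent.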

A curious feature of the “Tables for U. S. Corrections” (pp. 106-108) is 
that the age groups for which the weights are shown are not the same as 
those used throughout the report, but are a year lower, e.g., 15-19 and 20-24 
instead of 16-20 and 21-25. This does not affect the discrepancies mentioned 
in the preceding paragraph, however. 

[19] Terman’s review is cited more specifically in Note 9. Another review which 
makes several sound criticisms of the interpretation is that by Jacob Goldstein
and Nicholas Pastore, “Sexual Behavior of the American Male: A
Special Review of the Kinsey Report,” Journal of Psychology, 26 (1948), pp.
347-362. Both of these reviews have been helpful in preparing this paper, 
which has also benefited from critical readings by Milton Friedman and by 
L. J. Savage. 

[20] The quotations from the Encyclopaedia of the Social Sciences are from J. F.
Rees, “Rogers, James Edwin Thorold (1823-90),” vol. 13, p. 417, and 
Willard L. Thorp and George R. Taylor, “Prices,” vol. 12, p. 377. 



THE CITY BLOCK AS A UNIT FOR RECORDING 
AND ANALYZING URBAN DATA 


Edward B. Olds, Research Director 
Social Planning Council, St. Louis, Missouri

Tabulations by city blocks make possible many uses of 
small area data beyond those which can be made from census 
tract tabulations. Block data can be economically analyzed 
and summarized by the use of summary punched cards. Some 
uses of block data are illustrated from the St. Louis experience. 
Suggestions are presented for new census data needed by 
blocks. The publication of a local block map and street direc¬ 
tory facilitates the compilation of new data to supplement 
those obtained from the decennial census. Despite the many 
uses and advantages of block data, they do not replace census 
tract tabulations which meet a somewhat different need. 

The need for some type of standard unit geographic area for recording 
data about a city is apparent to many research workers 
and administrators faced with the problem of drawing conclusions 
from statistics about the city. The census tract, popularized by Dr. 
Walter Laidlaw and later by Howard Whipple Green, has served as the 
most generally accepted statistical unit area for American cities. It 
was developed as a compromise device to facilitate the analysis of 
population trends and characteristics in sections of cities. Honest 
attempts have been made to define census tract boundaries so that 
they include territory with reasonably homogeneous characteristics 
and with a population of from 3,000 to 6,000 persons. Unfortunately, 
characteristics change and what were once good boundary lines in 
terms of economic or ethnic indicators are not always good boundary 
lines. To preserve comparability from one census to another, it is 
necessary to keep census tract boundaries intact except for changes in 
city limits. However, within limits, the census tract served to 
reveal gross average differences between major sections of cities, as 
well as trends from one census to another. The census tract has ad¬ 
mittedly an important place in the analysis of urban data, since it is 
small enough to show up differences between major sections of cities, 
and yet large enough to be easily manipulated without considerable 
expense. In large cities, such as New York and Chicago, it has been 
found necessary to combine census tracts into statistical or community 
areas providing more adequate bases for the computation of death 
rates and simplifying the mechanics of presenting and interpreting 
data about sections of the city. 




AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1949 


Just as the magnifying glass is not completely replaced in usefulness 
by the microscope, so the usefulness of the census tract is not displaced 
by a more minute and precise unit area. But there are many uses of 
spatially ordered data about the city which can be made when they are 
available by a smaller unit area than the census tract. If it is conceded 
that a smaller unit area is desirable, the question comes up as to how 
much smaller the area should be and how it should be defined. Should 
it be something like a precinct used in organizing elections, a “beat” 
in police circles, or some other arbitrary grouping of city blocks? No 
matter what grouping is adopted, there will always be districts which 
cannot be made to fit the established boundaries. Does this mean that 
the problem is insoluble? It can be easily answered by the adoption 
of the city block as a unit area. Nearly all of the following types of 
districts are composed of city blocks. 


School districts 
Water districts 
Sewer districts 
Health districts 
Precincts and wards 
Police precincts and beats 
Meter reading districts 
Telephone exchange districts 
Power districts 
Campaign solicitation unit areas 
Diocese and parishes 
Census tracts 
Neighborhood areas 
Improvement districts 
Zoning districts 
Fire districts 
Library districts 
Welfare administration districts 
Tax assessment districts 

Data compiled by blocks with totals recorded in punch cards can be 
economically summarized by almost any of the above districts for 
any city. On the other hand, it is only rarely that data tabulated by 
census tracts or even enumeration districts can be accurately compiled 
according to the above types of districts. Even though an attempt is 
made in establishing census tracts to follow the boundaries of various 
administrative districts, the problem is practically insoluble without 
shifting the boundaries of the districts. 

The proponents of census tracts sometimes argue that if adminis¬ 
trators will not take the trouble to change the boundaries of their dis¬ 
tricts to conform to census tract boundaries, they can do without 
statistical information. Such an attitude fails to take cognizance of 
some of the very real difficulties preventing administrative districts 
from being brought into congruity with the boundaries of census 
tracts. For example, boundaries of school districts may have to be 
altered as there is movement of population, to maintain the proper 
balance of school enrollees in each school. If one district is growing in 
population while its neighbor is declining, it is obviously simpler to 
move the boundary of the district to correct the unbalanced enroll¬ 
ment situation, rather than to change the capacity of the school. More¬ 
over, for many purposes, it is necessary to have some unit smaller than 
a census tract to serve as an administrative area. For example, police 
beats, precincts, or campaign solicitation areas must be considerably 
smaller than the census tract with a population usually between 3,000 
and 6,000. The investment of sizeable funds in capital equipment such 
as telephone exchanges, power lines, sewer mains or water pipes may 
make it impractical to change boundaries of control areas merely to 
make them conform with artificial statistical areas. Unless accurate 
summaries can be made of the expensively compiled census informa¬ 
tion, many valuable uses are lost. Of course, in some instances it is 
possible to make estimates and approximations by prorating census 
tract data or by using overlay maps. But as business and government 
become more scientific, there is increasing demand for accurate infor¬ 
mation on which to base future plans and policies. By use of the block 
summary punch cards, accurate summaries can be obtained economi¬ 
cally without excessive cost beyond the cost of coding the original data 
in terms of blocks. In relation to the total cost of training enumerators, 
conducting the canvass, designating areas, coding and tabulating, the 
preparation of block summary punch cards is not excessive. If a five 
or ten per cent increase in cost makes possible a many-fold increase 
in the uses of urban data, such additional costs should be justified. 

THE ST. LOUIS BLOCK STATISTICS PROJECT 

Some indication of the possibilities in the use of block statistics may 
be gained by examining the St. Louis experience. In the fall of 1945, 
the local committee on census enumeration areas, called the Metropolitan 
St. Louis Census Committee, obtained the cooperation of 21 
business, government, welfare and educational establishments in 
sponsoring a local block statistics project. This involved purchasing a 
deck of block summary cards for St. Louis from the U. S. Census 
Bureau, converting the census block numbers to those used locally 
for over 60 years, and publishing a Block-Street Address Directory 
and Map. The cost of this work was largely covered by the sale of 
directories, maps, and sustaining memberships. The form of the directory 
and map, which was published by the offset reproduction of 
machine listings, is indicated in Figure 1. The map location code 
facilitates locating a specific block on the block map. It is also used to 
sort and list cards in geographic sequence to improve the efficiency of 
mapping. The neighborhood district name symbol and water district 
code were included in the directory to satisfy two agencies which as¬ 
sisted considerably in its compilation, the City Plan Commission and 
the St. Louis City Water Department. 



[Figure 1: facsimile of page 65 of the Block Street Index, listing streets (among them Dakota and Michigan Ave.) with house-number ranges and, against each range, the block number and area codes.] 

FIGURE 1. ILLUSTRATION OF FORM OF BLOCK-STREET 
DIRECTORY AND MAP 


The use of locally established block numbers facilitates obtaining 
and compiling current local information. The chief data compiled 
regularly are: 

1. Number of dwelling units in new homes for which building per¬ 
mits have been issued. 

2. Number of dwelling units in homes for which demolition permits 
have been issued. 

3. Number of white and Negro children enrolled in public elemen¬ 
tary schools. 

These data are useful locally to provide some indication of marked 
increases or decreases in population in particular neighborhoods. The 
school enrollment data are particularly useful in revealing annual 
shifts in the location of the Negro population of St. Louis. Since the 
city and school authorities routinely code their records by city block 
numbers, the cost of making block tabulations using punch card ma¬ 
chines is comparatively small. Summaries are tabulated by census 
tracts, neighborhood districts, census districts, precincts, and wards. 
The preparation of these summaries is facilitated by the use of a master 
deck of cards containing a series of code punchings signifying to which 
census tract, neighborhood district, etc., the particular block belongs. 
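
The summarizing operation described above can be sketched in modern terms as a lookup followed by an addition. The block numbers, district codes, and quantities below are invented for illustration, and the dictionary called MASTER merely stands in for the master deck of code punchings; it is a sketch under those assumptions, not the tabulating procedure actually used in St. Louis.

```python
from collections import defaultdict

# Hypothetical master deck: one record per block, carrying the code of every
# type of district to which the block belongs (all identifiers are invented).
MASTER = {
    "1001": {"census_tract": "12A", "ward": "07", "precinct": "07-03"},
    "1002": {"census_tract": "12A", "ward": "07", "precinct": "07-04"},
    "1003": {"census_tract": "12B", "ward": "08", "precinct": "08-01"},
}

# Hypothetical block summary cards: number of dwelling units in each block.
DWELLING_UNITS = {"1001": 40, "1002": 25, "1003": 60}


def summarize(block_totals, master, district_type):
    """Add block totals into totals for the chosen type of district."""
    totals = defaultdict(int)
    for block, quantity in block_totals.items():
        district = master[block][district_type]
        totals[district] += quantity
    return dict(totals)


tract_totals = summarize(DWELLING_UNITS, MASTER, "census_tract")
precinct_totals = summarize(DWELLING_UNITS, MASTER, "precinct")
```

Because every district code is carried on the block record, a new type of district requires only one more code per block, not a new tabulation of the original data.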

For recording summary statistics about blocks, use is made of a 
specially printed card illustrated in Figure 2. The fields lettered from 
A to Q are used for quantities, such as the number of dwelling units, 
white school children, or dwelling units constructed in 1944. The fields 
printed with city block number, U. S. census tract number, block 
number, etc. are used for standard area codes. Reproduced decks of 
cards in the master file can be prepared for use by members having 
their own machine facilities. 


[Figure 2: facsimile of the specially printed punch card, showing the lettered quantity fields and the fields for standard area codes.] 

FIGURE 2. PUNCH CARD FORM USED FOR RECORDING VARIOUS TYPES 
OF STATISTICAL DATA FOR CITY BLOCKS 

ECONOMIC RATING OF BLOCKS 

To provide a convenient means of classifying addresses by economic 
status, a block economic rating on the basis of 1940 rents was prepared. 
The block summary cards from the 1940 census contained information 
on the average rent in each block. These cards were sorted by this aver¬ 
age rent, listed, and at the same time the number of dwelling units in 
each block was cumulated. Those blocks with the lowest rents which 
included one per cent of the dwelling units in the city, were given a 
code of “01.” The blocks with slightly higher rents which included another 
one per cent of the homes, were given a code of “02.” This process 
was continued until the blocks with the highest rents, which included 
one per cent of the homes, were given a code of “100.” Addresses coded 
in terms of this block economic index, can be conveniently grouped 
into economic tenths, fifths, thirds, etc. Figure 3 illustrates the relation¬ 
ship between the block economic code and average rents. 
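
The rating procedure just described (sort by average rent, cumulate dwelling units, cut at each one per cent) may be sketched as follows. The block identifiers and figures are invented, and the treatment of a block that straddles a one per cent boundary (it takes the code reached once its own units are added) is an assumption, since the text does not say how such blocks were handled.

```python
def economic_codes(blocks):
    """Assign each block an economic code from 1 to 100.

    blocks: list of (block_id, average_rent, dwelling_units) tuples.
    Blocks are taken in order of rising average rent; the code is the
    cumulative per cent of the city's dwelling units, rounded upward.
    """
    total_units = sum(units for _, _, units in blocks)
    codes = {}
    cumulated = 0
    for block_id, _rent, units in sorted(blocks, key=lambda b: b[1]):
        cumulated += units
        # Integer ceiling of 100 * cumulated / total_units, never below 1.
        codes[block_id] = max(1, (100 * cumulated + total_units - 1) // total_units)
    return codes


def economic_tenth(code):
    """Group codes 1-10 into tenth 1, 11-20 into tenth 2, and so on."""
    return (code - 1) // 10 + 1


# Invented example: three blocks holding 200 dwelling units in all.
example = [("A", 12.0, 50), ("B", 30.0, 50), ("C", 20.0, 100)]
codes = economic_codes(example)
```

Fifths or thirds can be formed from the same code in the same way as tenths, which is the convenience the 100-point scale is meant to provide.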



FIGURE 3. CUMULATIVE PER CENT OF HOMES IN ST. LOUIS BLOCKS WITH LESS 
THAN SPECIFIED AVERAGE RENTS 1940 

Some of the uses which have been made of the block economic code 
may be of interest. In a public opinion survey, a random area sample 
was selected, using blocks as primary sampling units. Since it was not 
practical to make follow-up calls on every family in the sample, the 
possible biasing effect of differences in the percentage responding from 
low and high income areas was controlled by means of the block eco¬ 
nomic code. Tabulations were made of the number of families in the 
sample from five groups of economic areas classified by means of the 
block economic code. The distribution of usable questionnaires from 
these five groups of economic areas was also determined. Any signifi¬ 
cant differences in the distribution of questionnaires and the distribu¬ 
tion of the families in the sample were corrected by obtaining more 
interviews. In this way, the economic composition of the families 




represented by the questionnaires analyzed was kept close to the com¬ 
position of the population. 
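
The check on response by economic area can be illustrated by a short calculation. The article does not state the rule used to decide how many more interviews to obtain; the rule below (bring every group up to the response rate of the best-responding group) is an assumption made only for the sake of the example, and the counts are invented.

```python
import math
from fractions import Fraction


def response_rates(sample, usable):
    """Share of sampled families in each group returning a usable questionnaire."""
    return {group: Fraction(usable[group], sample[group]) for group in sample}


def extra_interviews(sample, usable):
    """Interviews still needed for each group to match the best response rate.

    Assumed rule, not taken from the article: every economic group is brought
    up to the rate already achieved in the best-responding group.
    """
    best = max(response_rates(sample, usable).values())
    return {
        group: max(0, math.ceil(best * sample[group]) - usable[group])
        for group in sample
    }


# Invented counts for five groups of economic areas (1 = lowest).
sample = {1: 100, 2: 100, 3: 100, 4: 100, 5: 100}
usable = {1: 50, 2: 60, 3: 70, 4: 70, 5: 80}
needed = extra_interviews(sample, usable)
```

Exact fractions are used so that the comparison of rates does not depend on rounding.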

In a study of subscriptions to the Community Chest obtained 
through neighborhood solicitation, the block economic code was used 
to provide an index of economic status for each solicitation area. A comparison 
between this economic index and subscriptions per family 
indicated a marked association, as illustrated in Figure 4. This information 
was helpful in determining the areas where neighborhood solicitation 
produced insufficient returns to justify the costs involved. 

[Figure 4: chart of the average gift per family, on a scale from $1 to $5, for each economic tenth.] 

FIGURE 4. AVERAGE 1947 COMMUNITY CHEST GIFT PER FAMILY IN NEIGHBORHOOD 
SOLICITATION OF PART OF ST. LOUIS CLASSIFIED BY ECONOMIC TENTHS 

[Figure 5: chart of the number of dwelling units demolished and constructed in each economic tenth, from lowest (1) to highest (10).] 

FIGURE 5. DWELLING UNITS CONSTRUCTED AND/OR DEMOLISHED UNDER PRIVATE 
AUSPICES IN ST. LOUIS ECONOMIC TENTHS DURING THE PERIOD 1940-46 


In a study of dwelling units constructed during the period 1940 to 
1946, the economic status of the blocks in which the construction took 
place was determined by the block economic code. A tabulation of the 
units constructed and demolished according to economic tenths (each 
tenth containing one-tenth of the 1940 dwelling units) indicated a 
highly skewed distribution as shown in Figure 5. The areas with the 
highest economic status had the most building, and the areas with the 
lowest economic status had the least building going on. On the other 
hand, the high economic areas had the least demolition of homes while 
the low economic areas had the most demolition taking place. 

[Figure 6: chart of weekly household income for each economic fifth, from lowest (1) to highest (5).] 

FIGURE 6. MEDIAN WEEKLY HOUSEHOLD INCOME REPORTED IN JULY 1947 
BY FAMILIES IN EACH ECONOMIC FIFTH OF ST. LOUIS 

Several applications of the block economic code have been made in 
classifying addresses in St. Louis City according to economic status. 
Although considerable error can result through such a method for 
estimating the income status of an individual, it is believed that for 
groups of persons or families, this method provides a fairly reliable index. 
Figure 6 illustrates the relationship that was found between average 
income reported by families in a sample survey and the block 
economic code based on 1940 rents. The interviews with the families 
were conducted in July 1947, and each family was asked to indicate 
in which one of the following classes their household income would fall: 

Under $25 per week 
$25 to $49 per week 
$50 to $99 per week 
$100 or more per week 

A study of the economic status of the blocks in which more than 40 
per cent of the population was Negro, indicated a heavy concentration 
of Negro blocks in the lower economic brackets. As shown in Figure 7, 
none of the Negro blocks were in the highest economic tenth, while 
20.3 per cent of the Negro homes were in the lowest economic tenth. 

Although no analysis of vital statistics data has been made, using 
the economic code, it is believed that highly significant differences 
would be found from a comparison of life expectancy, infant deaths, 
etc., in low and high income areas. Such analyses require the coding of 
births and deaths by blocks, as well as the tabulation of population 
census data by blocks. 

[Figure 7: paired chart of the distribution of homes in white blocks and in Negro blocks by economic tenth. Only the percentages for white blocks are legible.] 

Economic Tenths    Distribution of Homes in White Blocks (per cent) 
Lowest 1            8.1 
2                   8.2 
3                   8.1 
4                   9.2 
5                   9.6 
6                  10.5 
7                  10.9 
8                  11.7 
9                  11.8 
Highest 10         11.9 

FIGURE 7. COMPARISON OF DISTRIBUTION OF HOMES IN WHITE AND NEGRO AREAS 
OF ST. LOUIS ACCORDING TO ECONOMIC TENTHS 
Note. Each economic tenth included 10 per cent of the homes in St. Louis in 1940. 

COMPILATIONS FOR SPECIAL AREAS 

The block data on punch cards have been used to obtain summary 
tabulations of housing, school enrollment, and building permit sta¬ 
tistics for such areas as neighborhood districts, precincts, and wards. 
The City Plan Commission has divided the city into 99 neighborhood 
and industrial districts, basing the determination of boundaries upon 
such factors as major streets, railroads, proximity to parks and play¬ 
grounds, land use, etc. Until block statistics were available, it was 
not possible to obtain accurate housing statistics for these basic plan¬ 
ning units. As part of the St. Louis block statistics project, summary 
tabulations of the 1940 housing census block statistics were obtained. 
These included the following data: 

Residential structures 
Dwelling units (homes) 
Owner occupied homes 
Tenant occupied homes 
Homes built 1930 to 1939 
Homes built 1920 to 1929 
Homes built 1900 to 1919 
Homes built before 1900 
Negro families 
Homes with more than 1.5 persons per room 
Homes needing major repairs 
Homes without private bath 
Average monthly rent of homes 
Total rent 
Number reporting rent 

VOTING BEHAVIOR STUDY 

Summaries of housing data have been made for the precincts and 
wards of St. Louis. The summarized data were then used to compute 
octile ratings for each precinct in each of four housing factors. This was 
done by computing percentages for each precinct, ranking the per¬ 
centages, and then grouping the ranked precincts into eight groups. 
The four factors were as follows: 

1. Per cent of homes owner occupied. 

2. Per cent of homes built before 1900. 

3. Per cent of public school enrollees who were Negro in Nov. 1946. 

4. Average rents. 
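
The octile rating described before the list of factors (rank the precinct percentages, then divide the ranked precincts into eight groups) can be sketched as below. The precinct names and percentages are invented, the groups are assumed to be as nearly equal in size as possible, and tied percentages are simply left in the order in which they were supplied; the text does not say how ties were treated.

```python
def octile_ratings(percentages):
    """Rate each precinct from 1 (lowest eighth) to 8 (highest eighth).

    percentages: dict mapping precinct name to a percentage for one factor.
    """
    ranked = sorted(percentages, key=lambda precinct: percentages[precinct])
    count = len(ranked)
    # Position i in the ranking (0-based) falls in group (8 * i) // count.
    return {precinct: (8 * i) // count + 1 for i, precinct in enumerate(ranked)}


# Invented example: sixteen precincts, so two fall in each octile.
owner_occupied = {f"P{n:02d}": 5.0 * n for n in range(16)}
ratings = octile_ratings(owner_occupied)
```

With 784 precincts, as in the St. Louis study, the same rule places 98 precincts in each octile.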

The block data were also used to prepare estimates of the population 
21 and over in each precinct as of January 1, 1948. These estimates 
were based upon the current estimated number of families, using the 
1940 count of dwelling units, plus units represented in building permits 
issued since 1940, and less dwelling units represented in permits for 
demolitions since 1940. The estimated number of families was multi¬ 
plied by the 1940 ratio of population 21 and over, to families in the near¬ 
est census tract. The sum of these products for the city was compared 
with the estimated population 21 and over in the city. The provi¬ 
sional estimate for each precinct was then multiplied by a correction 
factor so that the figures finally used add up to the estimated popula¬ 
tion 21 and over in the city. While this method is subject to consider¬ 
able error, it was considered more reliable than any other available 
method for obtaining a current estimate of population 21 and over. 
Percentages and octile ratings were then computed for the proportion 
of the voting population registered to vote. A series of 28 other octile 
ratings was computed from the election statistics on civic issues, as 
well as for political parties in eight elections held since Nov. 1944. 
Comparisons were not made prior to this time because of non-com¬ 
parable precinct boundaries. The data prior to 1948 for each precinct 
were summarized on one specially printed tabulating card. Other 
punched cards were used to list the statistical data onto the printed 
card. Complete sets of 784 precinct data cards were turned over to the 
sponsors of the project. Other sets can be prepared economically from 
the master cards. The top line of each card contains a series of 23 
octile ratings. The percentages upon which these ratings were based 
are specified by small numbers printed below each rating and in the 
lower left corner of each percentage cell. A set of punched cards containing 
the octile ratings was used for an intercorrelation analysis, 
using the tetrachoric correlation method. This analysis indicated the 
following significant relationships between the housing and voting 
indexes: 




1. Democratic precincts tended to remain Democratic and Repub¬ 
lican precincts tended to remain Republican. 

2. Precincts with a large proportion of the population registered 
tended to have a large proportion of the registrants voting in each 
election. 

3. Areas with high home ownership had a larger proportion of the 
population registered than areas with low home ownership. 

4. High rent areas were more inclined to vote Republican than low 
rent areas. 

5. Areas with many old homes opposed daylight saving time and a 
new state constitution. 
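
The estimate of population 21 and over used in this study, described earlier in the section, reduces to a few steps of arithmetic, sketched here. The field names and all figures are invented; in particular the ratio for each precinct stands for the 1940 ratio of population 21 and over to families in the nearest census tract, which would have to be supplied from census tabulations.

```python
def estimate_adults(precincts, city_adults):
    """Estimate population 21 and over by precinct, adjusted to a city total.

    precincts: dict of precinct name -> dict with the (hypothetical) keys
        units_1940, units_built, units_demolished, adults_per_family.
    city_adults: independent estimate of population 21 and over in the city.
    """
    provisional = {}
    for name, p in precincts.items():
        # Current families: 1940 dwelling units, plus permits, less demolitions.
        families = p["units_1940"] + p["units_built"] - p["units_demolished"]
        provisional[name] = families * p["adults_per_family"]
    # A single correction factor makes the precinct figures add to the city figure.
    correction = city_adults / sum(provisional.values())
    return {name: value * correction for name, value in provisional.items()}


example = {
    "A": {"units_1940": 1000, "units_built": 100, "units_demolished": 50,
          "adults_per_family": 2.0},
    "B": {"units_1940": 500, "units_built": 0, "units_demolished": 100,
          "adults_per_family": 2.5},
}
estimates = estimate_adults(example, city_adults=3410)
```

In the invented example the provisional figures are 2,100 and 1,000, and the correction factor of 1.1 raises them to about 2,310 and 1,100.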

TABULATION OF SCHOOL ENROLLEES 

Each year in November, the Board of Education asks each ele¬ 
mentary school to prepare a report listing the block numbers in which 
their pupils reside and the number of pupils in each block. The Block- 
Street Address Directory is used in coding addresses by blocks. Since 
St. Louis has a completely segregated school system, it is possible to 
make tabulations of these data to show the number of white and 
Negro school enrollees in each block. Such tabulations have been made 
for each of the following years: 1941, 1945, 1946, 1947. From these 
data by block, it has been possible to prepare a map which shows the 
trend of movement of the Negro areas in St. Louis during the period 
1941 to 1947. Figure 8 illustrates the Post Dispatch map,[1] drawn from 
the more precise block map published in two colors by the Social 
Planning Council and the Urban League. The housing statistics and 
land use data by blocks were used to compute differences in dwelling 
units per residential area between white and Negro areas. 

[1] The St. Louis Post Dispatch carried a feature article by Richard G. Baumhoff on this study in 
the Sunday issue, Aug. 15, 1948. 

[Figure 8: map from the St. Louis Post Dispatch.] The author wishes to acknowledge with thanks the 
permission granted by the St. Louis Post Dispatch to publish this map. 

ESTABLISHING CAMPAIGN DISTRICTS 

Uses of the Block-Street Address Directory can be shown by describing 
a project involving the grouping of addresses of campaign 
prospects into convenient solicitation control areas. Cards were punched 
alphabetically giving the names and addresses of about 20,000 prospects. 
The punching of names and addresses was needed for the preparation 
of prospect lists, pledge cards, and mailing strips. In punching 
the addresses, house numbers were punched in one field while street 
names were punched in another field. Accordingly, it was possible to 
mechanically sort the cards by street and house number, keeping the 
odd house numbers separate from the even house numbers. The use 
of the Block-Street Address Directory, together with a listing of these 
sorted address cards, provided a highly efficient means of establishing 
the block codes. In marking the list with the block code, it was 
found that usually from five to 20 adjacent listings would be in the 
same block. The punching of the block code into the cards was accomplished 
by manually filing the cards behind pre-punched block 
master cards and then intersperse gang punching the cards. The cards 
were then sorted down and tabulated by block. Work maps were 
posted with the number of prospects in each block and area boundaries 
were drawn to include the proper number of prospects in each 
area. 
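
The coding of addresses by block, done above with a directory and sorted listings, amounts to matching each address against a range of house numbers on the odd or even side of a street. The sketch below assumes a directory of that form; the street entries and block codes are invented and are not taken from the St. Louis directory.

```python
from collections import Counter

# Hypothetical directory entries: (street, lowest number, highest number, block).
# Odd and even sides of a street are listed as separate ranges.
DIRECTORY = [
    ("MICHIGAN AVE", 4201, 4299, "B-1"),
    ("MICHIGAN AVE", 4200, 4298, "B-2"),
    ("MICHIGAN AVE", 4301, 4399, "B-3"),
]


def block_for(street, number, directory=DIRECTORY):
    """Return the block code for an address, or None if it is not listed."""
    for entry_street, low, high, block in directory:
        same_side = number % 2 == low % 2  # odd with odd, even with even
        if entry_street == street and low <= number <= high and same_side:
            return block
    return None


# Invented prospect addresses, counted by block as on the work maps.
prospects = [("MICHIGAN AVE", 4250), ("MICHIGAN AVE", 4251),
             ("MICHIGAN AVE", 4253), ("MICHIGAN AVE", 4305)]
prospects_per_block = Counter(block_for(s, n) for s, n in prospects)
```

Sorting the addresses by street and house number beforehand, as was done with the punched cards, would allow adjacent listings to be coded together instead of being looked up one at a time.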

MAKING SPOT MAPS 

In St. Louis, the city block numbers consist of four numerical digits 
and two alphabetical suffixes. Although this makes a rather cumber¬ 
some number, it is useful because many city records and maps are 
referenced with these official city block numbers. However, to locate 
any given block efficiently, it is necessary to have what is called a supplementary 
“map location” number. This number consists of two 
letters followed by two numbers, like “PQ42” which defines a particu¬ 
lar square of land in the city with sides one-fourth mile long. Every 
block is assigned to one specific square on the basis of where the ma¬ 
jority of its area is located. Cards punched with city block number or 
census tract and block number can be automatically gang punched with 
this map location number and other area codes at one operation. They 
can then be sorted and listed in geographic columns and rows which 
greatly facilitates the spotting of block maps to show the accurate dis¬ 
tribution of addresses. 

POTENTIAL USES OF BLOCK DATA 

The foregoing examples represent only a few of the possible uses 
of population and housing statistics by blocks. A consideration of these 
uses should suggest many others which would be made if data were 
available uniformly for every metropolitan area including the suburbs 
as well as the central city. The fact that accurate summaries can be 
made economically for a wide variety of administrative and study 
areas opens up uses which cannot be made of census tract statistics. 

Some of the uses which could be made through a more general avail¬ 
ability of block statistics are indicated in the following list: 

Determination of fire, theft, and life insurance risks in different types of 
neighborhoods. 

Studies of land values as related to population, sales, etc. 

Determination of business areas of the Metropolitan District. 

Planning changes in the location of transportation and utility lines. 

Appraisal of property for loans or taxation. 

Determination of cost of governmental and philanthropic services in each 
section of the city as compared to tax income obtained. 

Planning optimum location of public, private, or commercial facilities for 
recreation, education, sales, health or welfare service, etc. 

Estimates of sales or consumption using block statistics in the design of the 
sampling plan. 

Indexing detail real estate or land use maps. 

Determination of optimum districts for neighborhood improvement, police 
beats, meter readers, relief investigators. 

SUGGESTED NEW CENSUS BLOCK DATA 

One of the limitations upon the use of block statistics is the paucity 
of information generally available by blocks. It is believed that there 
could be a considerable increase in the variety of data tabulated from 
the decennial census without greatly increasing costs. If territory is 
assigned enumerators by blocks (as was done in the 1940 housing 
census) the punching of a block designation in all punch cards made 
from the schedules would be comparatively simple. Then special 
tabulations could be run by blocks or groupings of blocks for any de¬ 
sired detail. Routinely, it is believed that certain summary informa¬ 
tion could be tabulated by blocks recording the totals in summary 
cards. The following items could be recorded in three decks of such 
cards: 

Housing data (in order of importance): 

Number of dwelling units. 

Contract or estimated monthly rent. 

Owner occupied dwelling units. 

Dwelling units occupied by non-white persons. 

Dwelling units built before 1900. 

Dwelling units with no private bath. 

Number of dwelling structures. 

Dwelling units according to type of structure (3 groups). 

Dwelling units without private flush toilet. 

Dwelling units without running water. 

Dwelling units without mechanical refrigeration. 

Dwelling units without central heating. 

Population data (in order of importance): 

Number of persons. 

Age distribution (6 groups). 

Sex and color (4 groups). 

Population 25 and over by years of school completed (6 groups). 

Population 14 and over by employment status and sex (12 groups). 

Employed persons 14 and over by major occupation group (9 groups). 

These data could be economically published by listing them on 
plastic offset plates and reproducing several hundred copies for sale 
to users at a charge set to write off the publication cost. Users would be 
encouraged to purchase decks of cards on printed card forms clearly 
indicating the information punched in the cards. Users could also be 
supplied at cost with block maps of adequate scale for work purposes, 
reproduced through a blue print process from masters kept in the 
Census Bureau. Such maps should be in sections that would be small 
enough to be handled easily on normal size desks and drafting tables, 
and yet so made that they could be easily assembled to make up a one- 
piece map for a Metropolitan District. A certain amount of skilled 
consultation service should be made available by the Census Bureau 
to help users in making the best possible use of the block data. 

SUGGESTED LOCAL BLOCK TABULATIONS 

The preparation of block statistics as outlined above would require 
local committees in each community to help in making the best uses, 
as well as in promoting the compilation of local material. One of the 
first projects for each city committee would be the purchase of decks 
of the summary cards and sets of block maps in the form of negative 
blue print masters. Another project would involve the compilation of 
a local block-street address directory to facilitate the compilation of 
local data by blocks. The promotion of local tabulations should in¬ 
clude consideration for obtaining the following types of information: 

Building erections and demolitions. 

School census or school enrollment data. 

Land use statistics. 

Police arrests. 

Juvenile delinquency cases. 

Births and deaths. 

Persons receiving welfare services (chronically ill, tubercular, mentally ill, 

general hospitalization, foster home placement, etc.). 

Old age assistance, aid to dependent children, and general relief cases. 

Tax assessments and collections. 

Fire losses. 

The financing of local projects can be handled in various ways, depending
upon the community. Generally, each administrative agency can include in
its budget small amounts sufficient to cover the processing of statistics
falling within its jurisdiction. Sales of directories can be used to cover
the cost of their compilation and publication. Contributions from utilities,
banks, chambers of commerce, universities, foundations, real estate firms,
etc., can be used to write off the cost of local projects. However, it is
necessary to have interested and competent leadership for local committees.
Such persons may be found in a local city plan commission, council of social
agencies, university, chamber of commerce, utility, board of education,
etc. If interest warrants, it may be possible in certain communities to
establish agencies equipped with staff and machinery for the most
efficient processing of statistical data having general community-wide
significance. Block statistics would be one of the kingpins in such a
community research agency.

National agencies and concerns should find considerable uses for
block statistics when they become uniformly available together with
adequate maps and street address directories. Survey and polling
organizations should be able to effect economies and improvements in
their work through the use of block statistics. With adequate materials
and interpretation, it should be possible to cover some of the added
census cost of block statistics through the sale of cards, maps, and
listings.

COMPARATIVE ADVANTAGES OF CENSUS TRACT AND BLOCK DATA

From the St. Louis experience, we find that block statistics are
essential for the following types of analyses:

1. Compilation of census statistics for administrative areas which 
are not multiples of census tracts or enumeration districts. 

2. Compilation of census statistics for areas within a specified
distance or travel time of a particular geographic location (half mile
or mile).

3. Classification of addresses according to economic status based 
upon economic index computed for blocks. 

4. Appraisal of neighborhood characteristics in immediate vicinity 
of a specified address. 

5. Determination of exact boundaries of areas inhabited predominantly
by a particular ethnic group.

6. Design and selection of area samples for use in factual, attitude, 
and opinion surveys. 

Census tract data are inadequate, although better than no data, for 
analyses such as the above. However, census tract data are preferable 
to block data for analyses such as the following: 

1. Community studies of districts or sections within an urban area 
when the population of the district is over 20,000. 

2. Computation of ratios such as tuberculosis death rates for sections
of a city.

3. Presentation of a general view of the variation from community 
to community within a city in significant population or housing 
characteristics. 

In large cities even census tracts are too small for use in analyses such 
as the above. 

In conclusion, statistical tabulations of selected types of data by city
blocks supplement, rather than displace, tabulations by census tracts.
Wherever possible, administrative districts within a metropolitan area 
should be established as multiples of census tracts. Block statistics 
should be used only for analyses which require greater geographic 
detail than can be provided by census tract data. The judicious use of 
block and census tract statistics can make a noteworthy contribution 
to the more scientific administration of business and governmental 
services. 



THE RELATION OF THE NET REPRODUCTION 
RATE TO OTHER FERTILITY MEASURES 

T. J. WOOFTER

Recent population literature has been critical of the net 
reproduction rate on the grounds that it is based only on the 
female population, that it assumes the invariable continuation
of the reproductive situation of a single year, and that it
ignores the past childbearing experience of the generation. 
Alternative measures are: 

Male reproduction rates; 

Marital reproduction rates, which are of two types—(a) 
those showing the birth rates of a single year standardized 
for duration of marriage, and (b) those showing the number 
of children ever born to women who have been married for 
varying periods of time. 

Generation rates, which are based on the total number of
children ever born to a generation of women who have
completed the childbearing period. 

Standardized quota reproduction rates are proposed in 
order to preserve the generation principle, but center the 
experience measured closer to the current year. 

Rates adjusted for the order of birth of children, as proposed
by Whelpton, may be calculated either from the data
of a single calendar year or from the complete experience 
of a generation. 

All of these rates may be classified into two types: (1) Those 
which depend on the birth rates of a single year, and (2) those 
which cumulate the experience of a group over a period of 
years. The former are more sensitive to short-time changes in 
the birth rate, while the latter provide a longer and more 
stable base for measuring trends. 

I. CRITICISMS OF THE NET REPRODUCTION RATE¹

Since the first exposition of the net reproduction rate and Lotka’s
demonstration of its relationship to the true rate of natural increase,
it has been the measure of fertility used by demographers and
has provided the techniques for some of the most penetrating analyses
of fertility which have been produced in the past two decades.
Recently, however, the dissatisfaction with the net reproduction rate as

1 The most recent discussion of the net reproduction rate, which has been presented by Lotka, was
contained in a paper soon to be published in the Proceedings of the Meeting of the International
Statistical Institute, in Washington in 1947. Cf. also Journal of the American Statistical Association,
June 1936, "The Geographic Distribution of Intrinsic Natural Increase in the United States and the
Examination of the Relation of Measures of Net Reproduction.”



the measure of fertility, especially of fertility trends, has led to a
reappraisal of the possibilities of alternative measures.

The gross reproduction rate is the sum of the age-specific female
fertility rates of all women. The net rate is derived by multiplying the
age-specific female fertility rates by the corresponding female survival
rates and summing the products, thereby yielding a ratio of the number
of daughters who will survive to the age at which their mothers
gave them birth to the number of women of that age, the underlying
assumption being that the fertility rates and the mortality rates of a
particular year will remain constant.
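The two definitions just given can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the five-year age grouping, and all of the numerical inputs are hypothetical and are not taken from any population discussed in this article.

```python
def reproduction_rates(female_fertility, female_survival, width=5):
    """Return (gross, net) reproduction rates per woman.

    female_fertility: daughters born per woman per year, one value per age group
    female_survival:  probability that a newborn girl survives to that age group
    width:            number of single years of age in each group
    """
    if len(female_fertility) != len(female_survival):
        raise ValueError("fertility and survival schedules must align by age group")
    # Gross rate: sum of the age-specific female fertility rates.
    gross = width * sum(female_fertility)
    # Net rate: each fertility rate weighted by survival to the mother's age.
    net = width * sum(f * s for f, s in zip(female_fertility, female_survival))
    return gross, net


# Hypothetical schedules for ages 15-19, 20-24, ..., 40-44 (illustrative only).
fertility = [0.020, 0.060, 0.055, 0.035, 0.020, 0.005]
survival = [0.93, 0.92, 0.91, 0.90, 0.89, 0.88]

gross, net = reproduction_rates(fertility, survival)
print(f"gross = {gross:.3f}, net = {net:.3f}")
```

Because both schedules are taken from a single year, the result carries the constancy assumption stated above: it describes a hypothetical generation that lives its whole reproductive life under that year's rates.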

Some of the criticisms of the net reproduction rate seem to have been 
caused by a confusion as to the uses to which the rate may be put. It 
has, in other words, sometimes been carelessly used by those who are 
not familiar with all of its assumptions. This has been especially true 
when attempts have been made to use the customary reproduction 
rates for prediction. Three general uses to which such rates have been 
put may be distinguished: 

First, they are used for the simple comparisons of two or more populations
by means of rates relating to the same year, or of short-term
variations within the same population. Examples are the comparison
of urban and rural fertility in 1940, or the comparison of fertility in a
particular area in two successive years. Obviously, the accomplishment
of this purpose often requires a measure of fertility performance at a
particular time which is sensitive to all of the combined factors affecting
fertility. For such general purposes, the crude birth rate is
often satisfactory, or the age-standardized female birth rate, which is
the gross reproduction rate, may be used.

The second objective of such rates is the analysis of the factors
affecting fertility, by attempting to isolate the effect of the principal 
factors being studied. It is especially important, when correlations of 
economic or social variables are made with the birth rate, that as many 
other factors be standardized as possible. The requirement for these 
purposes is a series of measures standardized for different factors in 
order that the difference between standardized and unstandardized 
rates may reveal the fluctuations which arise from the factor under 
consideration. 
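A minimal sketch of this use of standardization follows, taking age as the factor to be held constant. The function names and all figures are hypothetical; direct standardization is used here as one common technique, not necessarily the one any particular investigator employed.

```python
def crude_rate(age_specific_rates, population_by_age):
    """Births per person: age-specific rates weighted by the population's own ages."""
    births = sum(r * p for r, p in zip(age_specific_rates, population_by_age))
    return births / sum(population_by_age)


def standardized_rate(age_specific_rates, standard_population):
    """Direct standardization: the same rates applied to a fixed standard population."""
    return crude_rate(age_specific_rates, standard_population)


# Two hypothetical populations with identical age-specific rates
# but different age distributions (three broad age groups).
rates = [0.05, 0.12, 0.04]
pop_a = [300, 400, 300]
pop_b = [200, 300, 500]

crude_a = crude_rate(rates, pop_a)
crude_b = crude_rate(rates, pop_b)
std_b = standardized_rate(rates, pop_a)  # population A serves as the standard

# The gap between the crude and standardized rates of B is the part of its
# lower crude rate that arises from age composition alone.
age_effect_b = crude_b - std_b
print(crude_a, crude_b, std_b, age_effect_b)
```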

The third purpose is the endeavor to predict the future by the
net reproduction rate of a particular year or of a series of years. This
interpretation of the rate sometimes creeps into the thinking of the
layman and even of the uninitiated investigator. This confusion is
understandable, since the very term “reproduction” implies a look





into the future. The question is often asked in this way: “Is this population
reproducing itself?” This is only another way of asking the question,
“Will the next generation be as large as this generation?” In
interpreting such a dynamic situation, some demographers have used
a measure which assumes static conditions.

All through the recent European literature which criticizes the net 
reproduction rate we find such terms as “invariable fertility,” or 
“relation of the tendency value to the situation of the moment,” or 
“the long-term prospects of population growth”; the implication being 
that, inasmuch as the net reproduction rate does not satisfactorily 
provide such prediction factors, there should be a rate which does. 
This concept, obviously, gets into the realm of philosophical reasoning. 
Are there in various populations inherent fertility trends which change
slowly, but persist for a long period of time? In recent discussions of
fertility measures, one gets the impression that some demographers
think that there are such invariable fertility trends. If we assume that
there is some underlying fertility pattern, there is difficulty not only
in choosing the proper measure to characterize it, but also, if prediction
is attempted, the investigator is confronted with all the technical
difficulties which beset extrapolation in any field.

The three principal objections which have been raised against the
net reproduction rate are: (a) that it applies only to females and takes no
account of the sex ratio and the difference in ages of fathers and mothers;
(b) that, for this rate to be meaningful in practical terms, it assumes
the invariable extension of the reproductive situation of a particular
moment; and (c) that it ignores the past in that it does not make allowance
for the influence of the past fertility experiences of the women
who have children in a particular year.

Even though the recent reversal of the drop in the birth rate may be
temporary, it has been sharp enough to cause a new crop of analyses
of fertility and to intensify experimentation with various measures
which might explain the phenomena. In the case of this country, the
native white gross reproduction rate fell from about 160 in 1915 to 104
in 1936 and rebounded to 159 in 1947. This violent fluctuation is in
itself concrete evidence of the small probability that the rates of any
one year will remain invariable, which is a basic assumption in the
use of the net reproduction rate as a measure of generation reproduction.

Critics of the net reproduction rate argue that, while long-time
trends may continue with some stability (if such long-time trends can
be discerned), still the variations in economic and social conditions




and in familial attitudes change rapidly and have different effects on 
short-term fluctuations in fertility. Hence, they have endeavored to 
analyze the effect of these short-term variables and, secondarily, have 
groped for a measure of what might be characterized as the underlying 
fertility trend. The lines of approach which are treated in the following 
pages are: (1) Male reproduction rates; (2) nuptial reproduction rates; 
(3) generation reproduction rates; (4) cohort replacement rates; and 
(5) rates adjusted for order of birth of children. 

It is not possible in the scope of one article to develop the details 
of each of these types of measures, but their chief characteristics may 
be described briefly, especially their relationship to the net reproduction 
rate and the availability of data for their calculation. 

II. MALE NET REPRODUCTION RATES²

In theory, as well as in practice, the male net reproduction rate is
similar to the female, except that it measures the number of sons born
to 100 fathers who will survive to the age which their father had
attained when they were born. It would appear that, if the population
is assumed to be tending toward an equilibrium when 100 women
produce 100 surviving daughters, then it would also be tending toward
an equilibrium when 100 fathers produce 100 surviving sons. If, therefore,
for a particular year (as in England in 1938) the paternal net
reproduction rate was .881 and the maternal only .808, which is the
more pertinent as the basis for judging the effect of 1938 fertility
conditions on future trends?

Difference between the mean length of male and female generations
is not the only cause of discrepancies. There is a difference in the
proportion married at various ages; there is a difference between the
distribution within the childbearing ages of men and women; and there
are differential mortality rates. All of these factors are reflected in
differences between male and female fertility rates. While there is no
apparent reason for choosing the rate of one sex as more useful than
the rate of another, comparisons between the two are revealing as to
the effect of the sex ratio and differential age of marriages in the
population.

2 Myers, R. J., “The Validity and Significance of Male Net Reproduction Rates,” Journal of the
American Statistical Association, Vol. 36, No. 214, June 1941.

Hajnal, J., “Aspects of Recent Marriage Trends in England and Wales,” Population Studies, Vol. 1,
No. 1, June 1947.

Tietze, Christopher, “Differential Reproduction in the United States,” American Journal of
Sociology, Vol. 49, No. 3, 1943.

Tietze, Christopher, “Differential Reproduction,” Milbank Memorial Quarterly, Vol. 19, No. 8, July 1939.

Karmel, P. H., “The Relations Between the Male and the Female Reproduction Rates,” Population
Studies, Vol. 1, No. 3, Dec. 1947.






III. NUPTIAL REPRODUCTION RATES³

Nuptial reproduction rates have been in use for some time, but, in
recent years, experimentation with these rates by European scholars
has increased greatly. In fact, some European demographers
assert that such rates are the most satisfactory which have been
evolved to date. As a result, their techniques have been considerably
refined, the chief modification being a refinement to allow for duration
of marriage. The previously used nuptial rates (unadjusted for duration)
were constructed on the same theoretical framework as a conventional
net reproduction rate, in that they were derived from age
and marital specific rates based upon the fertility and mortality
experience of a single year. However, they had the advantage of being
specific for nuptiality and allowing the student to isolate this factor for
special study. The use of such a rate as a basis for prediction is analogous
to the use of the net reproduction rate, in that it is assumed to
indicate the eventual rate of increase of a population with stable age
structure and invariable extension of the fertility, marriage, and
mortality rates of the year in question. For practical purposes, this is
evidently open to the same objection which the advocates of nuptial
rates have made against the conventional net reproduction rate.

Refinement of the general nuptial rate to allow for duration of 
marriage is made in two ways: 

(1) Fertility rates according to nuptiality are calculated for each
age and duration of marriage, and combined into a rate for all married
women by the use of a nuptiality table. These rates are again based on
a single calendar year's experience and, therefore, open to the same
objection which we continually emphasize; namely, that, as a predictor,
this assumes the invariable extension into the future of a single
year's fertility, mortality, and marital rates, with the added objection
that another factor (fertility rates by duration of marriage) is also
assumed to remain constant. (For illustration, cf. Table 1.)
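The combining step can be sketched as a weighted average. The rates below are the Swedish rates for the first five years of marriage in 1933 and 1943 as given in Table 1; the weights standing in for a nuptiality table (married women still in each duration) are purely hypothetical, and the function name is an assumption made for this illustration.

```python
def combined_marital_rate(rates_per_1000, married_women):
    """Births per 1,000 married women, pooling duration-specific rates.

    rates_per_1000: fertility rate per 1,000 married women at each duration
    married_women:  women at each duration, as supplied by a nuptiality table
    """
    if len(rates_per_1000) != len(married_women):
        raise ValueError("one weight is needed for each duration of marriage")
    births = sum(r * w for r, w in zip(rates_per_1000, married_women))
    return births / sum(married_women)


# Durations 0 to 4 years, from Table 1.
rates_1933 = [363, 197, 167, 144, 129]
rates_1943 = [376, 239, 197, 195, 165]

# Hypothetical nuptiality-table weights, held fixed for both years so that
# the comparison reflects fertility by duration and nothing else.
weights = [100, 95, 90, 86, 82]

rate_1933 = combined_marital_rate(rates_1933, weights)
rate_1943 = combined_marital_rate(rates_1943, weights)
print(round(rate_1933, 1), round(rate_1943, 1))
```

Holding the weights fixed is what makes the combined rate a standardized one; it also makes plain that the duration-specific rates themselves are still those of a single calendar year.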

An ingenious refinement of such rates has recently been proposed by
Hyrenius.³ This method takes into account both the male and
female age and marital distribution by duration of marriage. Its
derivation is described by him as follows: "(a) The elaboration of an
index of the proportion of sexes among the non-married persons within

3 Glass, D. V., Population Policies and Movements in Europe, Appendix, pages 399-405.

Hajnal, J., “Analysis of Recent International Recovery in the Birth Rate,” Population Studies, Vol.
1, No. 2, September 1947.

Quensel, Carl-Erik, “Population Movements in Sweden in Recent Years,” Population Studies, Vol.
1, No. 1, June 1947.

Hyrenius, Hannes, “La Mesure de la Reproduction et de l’Accroissement Naturel,” Population,
April-June 1948.





certain age limits; (b) the calculation on the basis of this index and of 
observed nuptiality rates of a table adjusted for feminine nuptiality 
and the distribution of the newly married by age; (c) construction, on 
the basis of mortality and divorce statistics, of a table of attrition of 
marriages for various groups of ages at marriage; adjustment of an 

TABLE 1*

FEMALE MARITAL FERTILITY RATES BY DURATION OF MARRIAGE
(Age of wife: 15-49 years)

Duration of         Fertility Rate Per 1,000 Married Women
Marriage            --------------------------------------
(single years)        1933      1939      1941      1943

  0-                   363       320       325       376
  1-                   197       198       212       239
  2-                   167       171       168       197
  3-                   144       154       151       195
  4-                   129       133       133       165
  5-                   112       123       115       150
  6-                   100       109       101       135
  7-                    89        95        90       123
  8-                    82        83        80       104
  9-                    73        74        70        90
 10-                    67        66        62        82
 11-                    69        68        66        72
 12-                    65        64        61        63
 13-                    52        48        45        64
 14-                    48        46        39        46
 15-                    40        34        35        39
 16-                    38        31        32        34
 17-                    32        30        27        30
 18-                    25        25        22        24
 19-                    26        21        19        21

* From “Population Movements in Sweden in Recent Years,” by Carl-Erik Quensel, Population
Studies, June 1947, p. 34.

analytical function to this table; (d) calculation, by combining the 
distribution by age of the newly married with the table of attrition of 
marriages, of the distribution of married women by age of the husband 
and duration of marriage; (e) calculation, on the basis of the preceding 
functions of the rate of reproduction and intrinsic rate of natural
increase, legitimate fertility being given by age of the mother associated
with duration of marriage, since illegitimate fertility is only known 
by age groups.” This is, manifestly, a complex computation—one 
which requires more data than are available for most populations. 

(2) When births are recorded, year by year, according to age of 
mother and duration of marriage, it is possible to trace a cohort of 









marriages back to the date of marriage and construct a table showing
the number of children ever born to women married in certain years,
according to age and duration up to the most recent year available. A
similar table can be constructed from census enumerations, which
record data on duration of marriage, age, and number of children ever
born (cf. reference above to Hajnal). (For illustration, cf. Table 2.)

TABLE 2*

AVERAGE NUMBER OF CHILDREN TO MARRIAGES, BY DURATION OF
MARRIAGE, AT THE END OF THE YEARS 1933, 1939 AND 1943
(Age of wife at marriage, 20-24 years)

Duration of     Average Number of Children Per Marriage at the End of
Marriage        -----------------------------------------------------
(years)               1933            1939            1943

   1                  0.45            0.39            0.43
   2                  0.68            0.62            0.66
   3                  0.88            0.84            0.82
   4                  1.06            1.01            1.06
   5                  1.24            1.14            1.15
   6                  1.42            1.27            1.33
   7                  1.68            1.41            1.43
   8                  1.73            1.63            1.60
   9                  1.89            1.64            1.62
  10                  2.03            1.74            1.71
  11                  2.17            1.85            1.80
  12                  2.36            1.97            1.88
  13                  2.63            2.09            1.95
  14                  2.71            2.20            2.03
  15                  2.82            2.31            2.06
  16                  2.96            2.41            2.17
  17                  3.10            2.52            2.27
  18                  3.23            2.69            2.36
  19                  3.40            2.83            2.45
  20                  3.56            2.97            2.64

* From “Population Movements in Sweden in Recent Years,” by Carl-Erik Quensel, Population
Studies, June 1947, p. 36.
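The construction behind a table of this kind, tracing one marriage cohort through successive calendar years, can be sketched as follows. The rates in the example are hypothetical, as are the function name and the layout of the input; the sketch shows only how year-by-year records by duration of marriage are cumulated along the cohort's own history.

```python
def children_ever_born(rates, marriage_year, last_year):
    """Cumulate births to one marriage cohort, year by year.

    rates: dict keyed by (calendar_year, duration_in_years) giving births
           per 1,000 marriages of that duration in that year
    Returns a dict mapping duration to average children per marriage
    accumulated through the end of that duration.
    """
    cumulative = 0.0
    table = {}
    for duration in range(last_year - marriage_year + 1):
        # The cohort married in marriage_year has this duration
        # in calendar year marriage_year + duration.
        year = marriage_year + duration
        cumulative += rates[(year, duration)] / 1000
        table[duration] = cumulative
    return table


# Hypothetical records for marriages contracted in 1930.
rates = {
    (1930, 0): 360,
    (1931, 1): 200,
    (1932, 2): 170,
    (1933, 3): 145,
}

cohort_1930 = children_ever_born(rates, marriage_year=1930, last_year=1933)
print({d: round(v, 3) for d, v in cohort_1930.items()})
```

Each entry draws on a different calendar year, which is the sense in which such a table rests on the fertility history of the cohort and not on the rates of one year.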


Such tables have the advantage of taking into account the past
fertility history of women in the cohort by duration of marriage.
Calculations based on such tables broaden the base period from the
fertility experiences of the most recently recorded year to that of the
whole range of fertility history of married women in the childbearing
ages. By making this shift in emphasis, such rates have a distinct
advantage in that they predict the extension of a much longer and more
stable experience than that of a particular year. Also, they predict the
extension of an experience in which the influence of previous births
to the mother is taken into account. If the principal determinant of
long-time reproduction is the average number of children which married








couples will rear during the whole span of their life, then the marriage
rates based upon the number of children ever born by duration of
marriage begin to develop a body of data which can provide the answer.
However, the question of predicting fertility is broader than that of
predicting married fertility alone. For general fertility, the answer
must take into account variation from time to time in the proportions
of married people at various ages.

Unfortunately, also, the dates at which duration of marriage-fertility
data have become available are so recent, even in most European
countries, that insufficient time has elapsed for a reliable trend to
appear. In the United States, reliable fertility data by duration of
marriage are not available in any form. Year-to-year cross sections could
be obtained from Census enumerations if questions were added as to
the duration of marriage and these were cross-tabulated with the
items of age and number of children ever born. Addition of the question
of duration of marriage to the birth certificate in this country
would not by itself solve the problem, for it would have to be
supplemented by accurate nuptiality tables. Otherwise, there would be no
population base upon which the rate could be calculated, and technical
difficulties would be injected by reason of plural marriages and broken
marriages.

Besides the short span of available data, there are other technical
complexities in constructing marriage-adjusted fertility rates. Not the
least of these is the problem of handling illegitimate births. This
difficulty is usually recognized by calculation of separate rates for
legitimate and illegitimate fertility. Another difficulty arises from the
dissolution of marriages. If, as indicated above, the ex-married are
progressively eliminated from the nuptiality table, a bias is probably
introduced, for the reason that the complete fertility of married couples
who remain married is greater than that of couples whose married life
is interrupted during the childbearing period by death, separation, or
divorce. The Census approach offers a possibility of minimizing this
difficulty by the combination of children ever born to the ex-married
with children born to the married.

In the United States, the fact that Census fertility data are secured
only from married and ex-married women and the fact that young
children are not completely enumerated reduce the accuracy of such
information from enumerations. The addition of the duration of marriage
question to the birth certificate would run counter to the established
social policy of eliminating facts as to illegitimacy from birth
records.





Theoretically, the objection to the use of number of children ever
born by duration of marriage (unless combined in some fashion with
the number of illegitimate births) arises from the fact that it ignores
the changes in the percentage of the population who are married,
because the base of the rate is narrowed from all women to married
women. The rates do, however, have the advantage of getting away
from a single year’s base and the additional advantages of taking into
account previous fertility history and providing for analysis of the
effect of duration of marriage as a separate factor.

Since post-war European investigations have relied so extensively
on marriage-adjusted rates, it behooves American students to follow
this development carefully to determine whether it is not desirable to
develop nuptiality tables and to secure reliable data by age, birth
order, and duration of marriage. Whelpton, in his recent work, has
corrected hypothetical cohorts for spinsterhood, but this correction
has consisted of an arbitrary reduction of 10 per cent in the married
population in each age, on the ground that this is the usual proportion
of those who remain unmarried until the end of the childbearing
period.

IV. GENERATION REPRODUCTION RATES⁴

The assumption underlying the net reproduction rate that the fertility
and mortality rates of a single calendar year will remain invariable
may be obviated by accumulating the actual number of female
children born in each of the years of life of a generation; i.e., the
generation gross rate is the sum of the age-specific gross female fertility rates
obtained at dates which are advanced one calendar year for each
advance of one year in the age of the mother. Similarly, appropriate
generation mortality rates may be applied to determine the number of
these daughters who will survive to the age at which their mother gave
them birth. Illustration of the arrangement of rates and calculation
is shown in Table 3. If a reliable Census enumeration of the number of
children ever born to women who have lived through the childbearing
period by various dates is available, a similar but somewhat cruder
measure may be calculated from the number of births reported by
women above age 45. This measure corresponds fairly well to the
measure calculated from actual generation reproduction frequencies,
except that in the United States, the Census enumerations tend to

4 Depoid, M., “Reproduction Nette en Europe Depuis l’Origine de l’État Civil,” Études
Démographiques, No. 1, Statistique Générale de la France.

Woofter, T. J., “Completed Generation Reproduction Rates,” Human Biology, Vol. 19, No. 3, Sep-





underestimate the number of children ever born, either because of
faulty memory of respondents, or because of the tendency not to report
illegitimate children or children of a previous marriage.⁵ Such
generation measures avoid the assumption of invariable fertility and
mortality rates by substituting for the rates of a single calendar year
the actual age-specific fertility rates of a particular cohort of women
who live from ages 15 to 45. Another advantage is that they are based

TABLE 3

CHILDREN PER 1,000 WOMEN, GENERATION BEGINNING
REPRODUCTION IN 1915

                                Children Per 1,000 Women
           Calendar     ---------------------------------------
Age        Years         1st      2nd      3rd      4th      5th
                         Year     Year     Year     Year     Year

15-19      1915-19        58       57       56       57       52
20-24      1920-24       166      166      153      152      154
25-29      1925-29       148      142      139      131      127
30-34      1930-34        95       89       85       80       82
35-39      1935-39        51       48       46       46       45
40-44      1940-44        17       17       15       16       17

Total births per 1,000 women living to age 45 ........  2,507
Female births per 100 women ..........................  121.6
Female survival rate to age 28, 1933 life table* .....   .898
Net reproduction rate ................................  1.092

* Instead of calculating separately the survival of daughters born to mothers at each age, accurate
results may be obtained by calculating the survival of all daughters up to age 28 (the average age of
mother) by a survival rate appropriate to the calendar year 18 years after the generation began
reproduction. (Cf. “Generation Reproduction Rates,” supra.)


upon the whole universe of women without elimination of the variations
caused by percentages who are married. That is to say, they
measure the impact of all factors operating upon fertility during a
single generation of women because they are only standardized for age
and mortality. They also have the advantage of allowing for the effect
of past fertility performance on the fertility rates of the moment.
Consequently, the generation rate is a slowly fluctuating measure in
contrast with the rapidly changing calendar-year net reproduction
rate. It, likewise, allows for the changes in the order of birth
of children by following one group of women all the way through the
childbearing period. In fact, the author has pointed out in the article
previously cited that the generation method may be applied to the
birth rates of children of the first, second, third, and higher orders, as
well as to the total fertility rate. In these cases, the sum of the birth

5 For such comparisons, see “Completed Generation Reproduction Rates,” cited in Footnote 4.













rates of children of all orders in each age is equal to the gross
reproduction rate.
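The arithmetic of Table 3 can be retraced in a short sketch. The annual rates and the survival rate of .898 are those printed in the table. The share of female births (0.485) is not stated in the text; it is an assumption inferred here from the ratio of the two printed totals and should be read as such.

```python
# Annual births per 1,000 women for the generation that began
# reproduction in 1915, by age of mother (Table 3).
table3 = {
    "15-19": [58, 57, 56, 57, 52],
    "20-24": [166, 166, 153, 152, 154],
    "25-29": [148, 142, 139, 131, 127],
    "30-34": [95, 89, 85, 80, 82],
    "35-39": [51, 48, 46, 46, 45],
    "40-44": [17, 17, 15, 16, 17],
}

# Generation gross rate: each rate comes from a successive calendar year.
total_births_per_1000 = sum(sum(row) for row in table3.values())

FEMALE_SHARE = 0.485         # assumption: proportion of births that are girls
SURVIVAL_TO_AGE_28 = 0.898   # 1933 life table, as printed in Table 3

female_births_per_100 = total_births_per_1000 / 10 * FEMALE_SHARE
net_rate = female_births_per_100 / 100 * SURVIVAL_TO_AGE_28

print(total_births_per_1000)            # 2507
print(round(female_births_per_100, 1))  # 121.6
print(round(net_rate, 3))               # 1.092
```

The single survival rate stands in for a separate survival calculation at each age of mother, following the shortcut described in the footnote to the table.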

An allowance for improvement in mortality by use of a generation
life table results in a marked increase in net rates over those calculated
on the assumption of the invariable continuation of current survival
rates. The difference may be illustrated by the effect on the women
born in 1900. When these women were 15 to 19 in the years 1915 to
1919, their children had a probability of surviving to age 17½ of only
.878. The children born to the same generation of women when they
were age 40 to 45 have a probability of surviving to age 17½ of .948,⁶
an improvement of 8 per cent in 25 years (Table 4). In fact, the children
born in 1940 had a probability of surviving to age 40 which is
superior to the probability that children born in 1915 would survive
to age 5.
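The improvement quoted above is a relative change between two survival probabilities, which a few lines can confirm. The values are those of the 15-19 row of Table 4; the variable and function names are illustrative only.

```python
# Probability that a white female child survives to age 17 1/2,
# keyed by the year shown at the head of each column of Table 4.
survival_to_17_5 = {
    1915: 0.878,
    1920: 0.900,
    1925: 0.915,
    1930: 0.927,
    1935: 0.938,
    1940: 0.948,
}


def relative_improvement(early, late):
    """Proportional gain in the survival probability."""
    return (late - early) / early


gain = relative_improvement(survival_to_17_5[1915], survival_to_17_5[1940])
print(f"{gain:.1%}")  # about 8 per cent over 25 years
```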

TABLE 4

5-YEAR GENERATION SURVIVAL RATES (MA) OF WHITE FEMALE
CHILDREN UP TO AGE OF MOTHERS FOR GENERATIONS
BEGINNING 1915-1940 AND 1920-1940

Mother’s Age            Initial Reproduction Year of Generation
When Child       -----------------------------------------------
Was Born          1915    1920    1925    1930    1935    1940

15-19             .878    .900    .915    .927    .938    .948
20-24             .895    .910    .922    .934    .943
25-29             .905    .917    .929    .938
30-34             .911    .922    .933
35-39             .917    .925
40-44             .916







It has been pointed out, however, that these rates have one serious
disadvantage, namely, the longer time span covered. The complete
generation experience of a cohort of women can only be recorded after
age 45, which means that the childbearing experience extends back
for 30 years. Likewise, the children who are born to women age 45
remain at risk of death for another 45 years, making the complete span
of generation replacement performance in the neighborhood of 75
years.⁷ This necessitates either carrying birth rates far into the past or

6 In making such calculations for recent generations, it is necessary to project survival rates into
the future to some extent. The generation life tables used in this and the previous article by the author
on this subject were calculated by using actual survival rates up to 1940 and Whelpton’s medium
mortality assumptions thereafter.

7 The average span is, however, much shorter. The author has pointed out in a previous article (cf.
Footnote 4) that an accurate method of calculating generation mortality of the children born to a
generation of mothers is to apply the single survival rate to age 28 (the average age at which mothers bore
children) from a life table applying to a date 18 years after the generation begins childbearing; i.e., the
children born to women beginning childbearing in 1915 would survive at an average rate equal to
survival to age 28 in 1933. Thus survival rates do not have to be estimated after the terminal
childbearing age.









projecting them far into the future, and of making similar projections
of survival rates. The whole United States was included in the birth
registration area only in the 1930's. Whelpton has estimated rates
back to 1920, and similar estimates for the group of mothers 15 to 19
may be made with some confidence back to 1915, with the result that
only the generations beginning in 1915, 1916, 1917, 1918, and 1919 can
be satisfactorily estimated. (These generations reached the age 44
in the years 1944, 1945, 1946, 1947, and 1948, and the daughters of the
older mothers of these generations will be in their older childbearing
ages from the years 1989 to 1993; but the average mortality of daughters
of these generations may be closely approximated from life tables
of 1933 to 1937.) It may be remarked in passing, however, that the
same difficulty arises in converting gross rates based on the total
number of children born by duration of marriage to net rates, for the
reason that by the time a cohort of married women nears the end of the
childbearing period, the duration of their marriage has been from 20 to
25 years.

V. COHORT REPLACEMENT RATES 

The calendar year net reproduction rate is criticized on the ground 
that it projects temporary conditions too far into the future; the gen¬ 
eration rate is criticized because it reflects conditions of a number of 
years past. The objections to both of these rates are partially obviated 
if a measure is used which reflects the cumulated number of births per 
100 women of all childbearing ages up to a particular date, regardless 
of what age has been attained by that date. That is to say, in 1945 the 
women born in 1900 had attained age 45; those born in 1901 had at¬ 
tained the age 44; etc. Consequently, only the oldest cohort has com¬ 
pleted childbearing; the others have varying periods to complete. If 
the cumulated reproduction of all reproducing cohorts⁸ is set in relationship
to a “standard” performance which would result in a net reproduction
rate of 100, the result is an approximate measure of the
rate of reproduction of all women of all ages up to the date at which
the calculation is made.

As the cohort quota replacement rate has not previously been dis¬ 
cussed in print, the following notation is introduced in relation to 
Tables 4, 5, and 6: 

$P_a$ = Female generation births per 100 women at age $a$.

$M_a$ = Female generation survival rate up to age $a$.

$M_a P_a$ = Net reproduction at age $a$.

$n$ = Number of generations in which an age group reproduces during the period for which the rate is calculated.

$\overline{M_a P_a} = \dfrac{\sum M_a P_a}{n}$ = Mean age-specific net reproduction.

(1) $\sum_{15}^{45} \overline{M_a P_a}$ = Average cohort replacement for all cohorts.

$\dfrac{\overline{M_a P_a}}{\sum_{15}^{45} \overline{M_a P_a}}$ = Age replacement quota.

(2) $\sum_{15}^{a+5} \dfrac{\overline{M_a P_a}}{\sum_{15}^{45} \overline{M_a P_a}}$ = Cohort replacement quota.

⁸ In the calculations presented in Tables 5 and 6, generations 5 years apart are shown instead of
those beginning in every calendar year.

The data for these calculations are arranged exactly as they are in
the calculation of the generation net reproduction rate, i.e., by obtaining
the female age-specific fertility rate for women of each age in
the calendar year when they attained that age (Table 5) and multiplying
by the generation age-specific survival rate from 0 up to that age
(Table 4). The results of these calculations are shown in Table 6 under
the heading, “Surviving Female Children per 100.”

TABLE 5

FIVE-YEAR FEMALE BIRTHS PER 100 WOMEN IN EACH
AGE: COHORTS BORN 1900-1925* (P_a)

Age of Women                         Women Born in:
When Children        1900     1905     1910     1915     1920     1925
Were Born                          Reached Age 15 in:
                     1915     1920     1925     1930     1935     1940

15-19               13.68    13.68    13.00    11.06    10.91    12.03
20-24               38.63    34.29    30.66    30.76    35.94
25-29               3S.42    29.10    28.28    33.66
30-34               20.80    19.11    22.07
35-39               11.46    11.83
40-44                3.98

* Based on revised estimates of P. K. Whelpton for under-registration and incompleteness of
registration area. U. S. Bureau of the Census, “Forecasts of the Population of the United States,” p. 17.

With this arrangement of the data, the next step is to compute the
average reproduction in each age, $\overline{M_a P_a}$. These averages are shown
in the next to the last column of Table 6. The sum of these averages
(Formula 1) yields an average net reproduction rate for all women of
all childbearing ages from the time when they were 15 until the date of
the calculation.

TABLE 6

[The body of Table 6 is not legible in this copy. As described in the text, it gives the surviving female children per 100 women for each cohort and age, the average for each age in the next to the last column, the age replacement quotas in the last column, and the cumulated quotas in the next to the last line.]

This calculation may be extended to determine a quota of “normal” 
reproduction to which each individual cohort may be compared in 
order to measure its relative performance up to the date of the calculation.
These quotas are established as follows: Divide each of the
average age-specific reproduction rates ($\overline{M_a P_a}$) by the total of these
rates, thus converting the rates for each age into the percentage of
births which normally occur in that age. Since the sum of the percent¬ 
ages equals 1, they determine age-specific frequencies which, if equaled, 
would result in a net reproduction rate of 1; thus, the extent of devia¬ 
tion of actual reproduction from such a quota measures the extent of 
the deviation of reproduction from a standard stationary rate. These 
quotas are shown in the last column of Table 6, and are cumulated in 
the next to the last line of Table 6 for comparison with actual cumu¬ 
lated reproduction. It is thus possible to compare any specific age in 
any cohort with a corresponding quota, or to compare the cumulated 
reproduction of each cohort with its quota. 
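
The quota procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `cohort_quotas` and the input figures are hypothetical (they are not taken from Tables 4 to 6), and the age groups are assumed to be supplied from youngest to oldest.

```python
def cohort_quotas(net_by_age):
    """Compute mean net reproduction, its total, and the quotas.

    net_by_age maps an age-group label to the list of net reproduction
    values (M_a * P_a), one value per cohort that has reached that age.
    Age groups are assumed to be given from youngest to oldest.
    """
    # Mean age-specific net reproduction for each age group.
    means = {age: sum(vals) / len(vals) for age, vals in net_by_age.items()}
    # Formula (1): average cohort replacement for all cohorts.
    total = sum(means.values())
    # Age replacement quota: share of the total falling in each age group.
    quotas = {age: mean / total for age, mean in means.items()}
    # Formula (2): cumulated quota up to and including each age group.
    cumulated = {}
    running = 0.0
    for age in net_by_age:
        running += quotas[age]
        cumulated[age] = running
    return means, total, quotas, cumulated


# Hypothetical figures for three age groups (not from the article's tables).
example = {
    "15-19": [12.0, 11.0, 10.0],
    "20-24": [34.0, 30.0],
    "25-29": [32.0],
}
means, total, quotas, cumulated = cohort_quotas(example)
```

A cohort's actual cumulated net reproduction (per 100 women) can then be set against 100 times the cumulated quota for the age it has attained.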

Whereas the major weighting of the experience of a completed gen¬ 
eration is 17 years before the end of their experience (when women are 
age 28), the major weighting of the experience of the total population 
is about 11 years preceding the date of the calculation. It will be noted 
that in Table 6, 11 of the 21 cohort-age groups measure fertility for the
10 years immediately preceding the date of the calculation, and 10 of 
the 21 relate to the older generations whose experience extends from 
10 to 30 years back of the date of calculation. A series of such cohort 
replacement rates is even more stable than the rate for a series of gen¬ 
erations. 

In avoiding some of the disadvantages of other rates, however, such 
calculations also lose some of their advantages, since they refer neither 
wholly to the present nor wholly to the complete experience of gen¬ 
erations. However, a series of such calculations extending over a num¬ 
ber of years would provide a basis for extrapolation which is more 
sensitive to present conditions than a series of generation rates. 

Whelpton presented fertility data for such incomplete cohorts in a
paper as yet unpublished.⁹ He did not, however, convert the gross
reproduction rates of these incomplete cohorts into net rates.


⁹ Paper delivered at the 1947 Session of the International Statistical Institute.






VI. PARITY ADJUSTED RATES¹⁰

Although Whelpton has published rates adjusted for age, parity, 
fecundity, and marriage, we shall discuss at this point only the adjust¬ 
ments which were made for parity. His observations led to the conclu¬ 
sion that, if birth rates by order of birth, as calculated for a single ab¬ 
normal calendar year, are assumed to apply to a cohort of women who 
actually pass through the childbearing period, impossible results can be 
obtained. Such calculation for the year 1942 produced the impossible 
rate of 1,084 first births per 1,000 women. (With the 1947 calendar 
year as a basis, the results would be still more impossible.) 

His first approach to the solution of this difficulty was to standardize
for birth order by beginning with the life table cohort of women who 
had had no children and successively deducting the mothers who bore a 
first child and survived to the next age from the base of the rate for 
second births, and so on for each higher order of birth. The resulting 
parity-adjusted rate had the advantage of avoiding impossible results, 
but was still open to the objections which have been made to rates 
which generalize the experience of a single calendar year. Realizing 
this, Whelpton in some of his later (unpublished) work has made cal¬ 
culations according to order of birth on the basis of the actual recorded 
generation experience. If recorded statistics are available, this method 
has all of the advantages and disadvantages of the generation ratio de¬ 
scribed in the preceding section. The period for which Whelpton has 
made his later generation birth order calculations is the same as that 
used for the generation reproduction rates calculated by the author. 
They are thus susceptible to reduction to net rates by the use of gen¬ 
eration life tables and have the obvious advantage of being adapted to 
studies in which birth order is an important factor singled out for 
special consideration. 

VII. CONCLUSION 

The variety of experimentation with rates in all countries where 
data are available should provide the basis for extending and refining 
the analysis of fertility. The development of a variety of techniques 
should, therefore, be welcomed. A review of the foregoing discussion
reveals that no one of these rates is most appropriate for all purposes, 
but that each is well adapted to some particular purpose. Perhaps the 
effort of some demographers to arrive at a single “optimum” measure 
which will reflect the condition of the moment and at the same time 

¹⁰ Whelpton, P. K., “Reproduction Rates Adjusted for Age, Parity, Fecundity, and Marriage,”
Journal of the American Statistical Association, December 1946.





provide a basis for the measurement of underlying fertility trends is 
as difficult as it is “to eat one's cake and have it, too.” Certainly no
measure reviewed in this article is well adapted to both purposes. 
Demographers who are interested in “invariable fertility trends” 
might do well to review the history of economic prediction which relies 
not on one but on the variety of interacting indices. 

It will be observed that the rates summarized above fall into two 
general groups: 

(1) Those which use as their basis the experience of a single calendar 
year and thus concentrate on the condition of the moment. They are 
subject to violent fluctuations, such as those that have taken place in 
the last 20 years. These fluctuations make it difficult to select the proper
base period from which to project a trend and to determine the
shape of the curve which is being projected. 

(2) Those which follow the generation technique of recording the 
actual fertility experience of mothers at a particular age in the year 
during which they attained that age. These rates include the general 
generation rate, the rates based on the cumulated number of children 
born to women who were married at a certain date and age, and the 
cumulated birth order rates based on actual chronological experience. 
This latter type is not so sensitive to year-to-year changes, and, con¬ 
sequently, does not reflect conditions of the immediate present as 
accurately as do rates based on the conventional net reproduction 
technique. They are, however, more stable and are based on a longer 
time period, reflecting the impact of past fertility on present rates. 
Not the least of the advantages of the calculations of the generation 
type is the fact that they are adapted to conversion to net rates by al¬ 
lowing for generation improvement in mortality. 



ON ESTIMATING THE MEAN AND STANDARD 
DEVIATION OF TRUNCATED NORMAL 
DISTRIBUTIONS* 


A. C. Cohen, Jr. 

University of Georgia 

The problem considered is that of estimating the mean and 
standard deviation of a normally distributed population from 
a truncated sample when neither count nor measurements of 
variates in the omitted portion of the sample is known. Formulas
are developed whereby certain special functions required
in solutions given for this problem by Karl Pearson and Alice 
Lee and by R. A. Fisher may be readily evaluated with the 
aid of an ordinary table of the areas and ordinates of the nor¬ 
mal curve. A method of successive approximations is illus¬ 
trated which, with the aid of the above formulas permits the 
utilization of either the Pearson-Lee or the equivalent Fisher 
method to obtain the desired estimates with an improvement 
in accuracy regardless of whether or not the special tables or¬ 
dinarily required by these two methods are available. 


I. PEARSON-LEE METHOD 

Karl Pearson and Alice Lee [1], and Alice Lee [2], employed the
method of moments as early as 1908 to develop formulas which
may be used to estimate the mean and standard deviation of a normally
distributed population from data provided by a truncated random
sample from which all record, including both count and measurements,
of all variates whose value is beyond a given truncation point has
been omitted. Their results, except for minor changes in notation, may
be summarized as follows:

$$m' = x_0' - h'\sigma' \tag{1}$$

$$\sigma' = \frac{\sum x}{n}\,\psi_1 \tag{2}$$

$$\frac{n\sum x^2 - (\sum x)^2}{(\sum x)^2} = \psi_2 \tag{3}$$

In the above equations, $m'$ and $\sigma'$ are estimates of the mean and
standard deviation of the population (complete distribution). $x_0'$ is the
point of truncation measured on the original scale of the variate $x'$.
The omitted portion of the sample is here considered to be to the left
of $x_0'$. The summations $\sum x$ and $\sum x^2$ are taken about $x_0'$ as an origin.


* A portion of this paper was presented before the Southeastern Section of the Mathematical
Association of America at Tuscaloosa, Alabama, March 19, 1949.



$h'$ is the point of truncation measured in standard units of the population;
that is,

$$h' = \frac{x_0' - m'}{\sigma'};$$

$n$ is the number of variates in the truncated sample, and $\psi_1$ and $\psi_2$ are
moment functions of $h'$. Tables of both these functions evaluated to
three places of decimals at intervals of 0.1 in $h'$ are contained in the
original papers and also in “Tables for Statisticians and Biometricians”
[3], Vol. I, Table XI and Vol. II, Table XII. To apply the Pearson-Lee
method, one proceeds as follows:

(1) Evaluate the left side of Equation (3) from the sample data.
Enter the table of $\psi_2$ with this value and obtain $h'$ by inverse
interpolation.

(2) Using the above value of $h'$ as the argument, read $\psi_1$ from the
appropriate table and apply Equation (2) to obtain $\sigma'$.

(3) With both $\sigma'$ and $h'$ determined, then apply Equation (1) to
obtain $m'$.

Unfortunately the tables required for this method are not as widely
distributed as might be desired. Furthermore the entries contain too
few significant digits and are tabulated at an interval of the argument
(0.1) that is too wide to permit $h'$ to be determined with sufficient
accuracy for many applications.


II. FISHER METHOD

In 1931, R. A. Fisher [4] demonstrated that the “maximum likelihood”
estimates for this problem are identical with those obtained
by the method of moments. His results, however, were expressed in a
slightly different form from those of Pearson and Lee. Fisher employed
a moment function of $h'$ (designated by $\xi$ in his discussion) which he
labeled as an $I_n$ function and which may be defined as:

$$I_n(h') = \frac{1}{\sqrt{2\pi}} \int_{h'}^{\infty} \frac{(t - h')^n}{n!}\, e^{-t^2/2}\, dt \tag{4}$$

and for which the following relations hold:

$$(n + 1) I_{n+1} + h' I_n - I_{n-1} = 0; \tag{5}$$

$$\frac{dI_n}{dh'} = -I_{n-1}; \qquad n > -1. \tag{6}$$

The Fisher results in terms of the $I_n$ functions are:

$$\sigma' = \frac{\sum x}{n}\,\frac{I_0}{I_1} \tag{7}$$

and

$$\frac{n\sum x^2}{(\sum x)^2} = \frac{2 I_0 I_2}{I_1^2}. \tag{8}$$

His equation for obtaining $m'$ after $\sigma'$ and $h'$ are determined is the
same as that given by Pearson and Lee (Equation 1).

Tables of $I_0$, $I_1$ and $I_0 I_2/I_1^2$ (labeled $Hh_0 Hh_2/(Hh_1)^2$) as required for
use in the Fisher formulas (Equations 7 and 8) are included in “Mathematical
Tables,” Vol. 1, of the British Association for the Advancement
of Science. Entries in these tables are given to a greater number of
decimals than the Pearson-Lee tables (from 6 to 9 significant digits
for most entries) but the interval of the argument $h'$ (0.1) is the same
as for the Pearson-Lee tables. The greater number of significant digits
permits greater accuracy in determining $h'$, but only by resorting to
inverse interpolation formulas involving the second and higher order
differences. At best such computations are rather tedious and somewhat
bothersome to carry out. With regard to availability, the B.A.A.S.
Tables are perhaps even less widely distributed than the Pearson
Tables. The application of the Fisher results is almost identical with
that of the Pearson-Lee results. The quantity on the left side of (8) is
computed from the sample data and $h'$ is determined by inverse interpolation
from the table of $I_0 I_2/I_1^2$. Using the value of $h'$ thus determined,
$I_0$ and $I_1$ are obtained from the appropriate tables and $\sigma'$ is
computed by use of Equation (7). Equation (1) is then used as before
to obtain $m'$.
III. EQUIVALENCE OF PEARSON-LEE AND FISHER RESULTS 

Since Equations (2) and (3) are equivalent to (7) and (8) it follows
that:

$$\psi_1 = \frac{I_0}{I_1} \tag{9}$$

and

$$\psi_2 = \frac{2 I_0 I_2}{I_1^2} - 1. \tag{10}$$


IV. NEW CONTRIBUTIONS 

In the present paper, equations are derived which permit the calculation
of $\psi_1$, $\psi_2$ and likewise $2 I_0 I_2/I_1^2$ without resort to any tables
other than an ordinary table of areas and ordinates of the normal
frequency curve such as can be found in practically any handbook of
mathematical tables. By using the formulas presented herein, the
Pearson-Lee or the Fisher technique can be readily applied regardless
of the availability of the special tables previously mentioned. Even
when the special tables are available, the formulas developed in this
paper permit the attainment of greater accuracy in determining $h'$
and consequently in obtaining $\sigma'$ and $m'$ with a minimum computing
effort.

V. DERIVATIONS

Let $n=0$ and 1 respectively in Equation (5) to obtain

$$I_1 = I_{-1} - h' I_0 \tag{11}$$

and

$$2 I_2 = I_0 - h' I_1. \tag{12}$$

Let $n=0$ in Equation (6) and we have $dI_0/dh' = -I_{-1}$. If now $n=0$
in Equation (4) it follows that

$$I_0 = \frac{1}{\sqrt{2\pi}} \int_{h'}^{\infty} e^{-t^2/2}\, dt, \tag{13}$$

which is recognized as the area under the normal curve to the right of
the ordinate $t=h'$. Direct differentiation of (13) gives $dI_0/dh' = -\phi(h')$,
where $\phi(h')$ is the ordinate of the normal curve at $t=h'$.

Consequently we may write

$$I_{-1}(h') = \phi(h'). \tag{14}$$

Upon substituting the results of Equations (11), (12) and (14) in the
right side of Equation (8) it follows that

$$\frac{2 I_0 I_2}{I_1^2} = \frac{[I_0 - h'(\phi - h' I_0)]\, I_0}{[\phi - h' I_0]^2}. \tag{15}$$

Now if we define

$$Z(h') = \frac{\phi(h')}{I_0(h')}, \tag{16}$$

and divide both the numerator and denominator of the right side of
(15) by $I_0^2$ we obtain

$$\frac{2 I_0 I_2}{I_1^2} = \frac{1 - h'(Z - h')}{(Z - h')^2}. \tag{17}$$

From Equation (10) it then follows that

$$\psi_2 = \frac{1}{Z - h'}\left[\frac{1}{Z - h'} - Z\right]. \tag{18}$$

Similarly from Equations (11) and (14) we obtain

$$\psi_1 = \frac{I_0}{I_1} = \frac{1}{Z - h'}. \tag{19}$$

Equations (17), (18) and (19) are thus expressed in convenient forms
to permit rapid calculation of the special functions required by both
the Pearson-Lee and the Fisher methods for any values of the argument
$h'$ as may be necessary, provided only that existing tables of the areas
and ordinates of the normal frequency curve are available. These
formulas would also be useful in extending both the Pearson-Lee and
the Fisher tables by computing additional entries at closer intervals
of the argument $h'$.
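
As an illustration of Equations (16) to (19), the following Python sketch evaluates the special functions from the normal ordinate and the upper-tail area alone. It assumes the standard library's `math.erfc` in place of a printed table of areas; the function names are hypothetical and chosen only for this example.

```python
import math


def normal_ordinate(h):
    """Ordinate phi(h) of the standard normal curve."""
    return math.exp(-0.5 * h * h) / math.sqrt(2.0 * math.pi)


def upper_tail_area(h):
    """I_0(h): area under the standard normal curve to the right of h."""
    return 0.5 * math.erfc(h / math.sqrt(2.0))


def z_ratio(h):
    """Z(h) = phi(h) / I_0(h), as in Equation (16)."""
    return normal_ordinate(h) / upper_tail_area(h)


def psi1(h):
    """psi_1 = 1 / (Z - h), as in Equation (19)."""
    return 1.0 / (z_ratio(h) - h)


def psi2(h):
    """psi_2 = [1/(Z - h)] * [1/(Z - h) - Z], as in Equation (18)."""
    z = z_ratio(h)
    inv = 1.0 / (z - h)
    return inv * (inv - z)


def fisher_ratio(h):
    """2 I_0 I_2 / I_1^2 = [1 - h(Z - h)] / (Z - h)^2, as in Equation (17)."""
    z = z_ratio(h)
    return (1.0 - h * (z - h)) / (z - h) ** 2
```

For any argument, `fisher_ratio(h) - 1` should agree with `psi2(h)`, which is the relation stated in Equation (10).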

VI. DETERMINING h' BY SUCCESSIVE APPROXIMATIONS 

Although tables are used in both the Pearson-Lee and the Fisher
methods, the basic problem involved is the solution of either Equation
(3) or its equivalent Equation (8) for $h'$. A method of successive
approximation which makes use of (17), (18), and (19) and simple
linear interpolation has been found to be quite satisfactory for determining
$h'$ to additional significant digits. If either the Pearson-Lee
or the Fisher tables are available they might be used to furnish a
first approximation of $h'$. For use when neither of these tables is
available, a graph of $\psi_2$ for values of $h'$ from −3.5 to +3.5 is
given in Figure 1. In the absence of both tables and graph a reasonably
satisfactory first approximation to $h'$ can be obtained from

$$h' \approx (x_0' - \bar{x})/s_x \tag{20}$$

where $\bar{x}$ is the mean and $s_x$ is the standard deviation of the truncated
sample. To compute $s_x$ use the formula

$$(n s_x)^2 = n\sum x^2 - \left(\sum x\right)^2.$$


VII. NUMERICAL EXAMPLE

The various steps involved in determining $h'$ and subsequently $\sigma'$
and $m'$ by the successive approximation technique mentioned in the
previous paragraph can best be understood by computing these
quantities for a typical set of data such as the following:

$n=37$; $x_0'=0.850$; $\sum x=51.8600$; and $\sum x^2=98.0156$




FIGURE 1

GRAPH SHOWING RELATION BETWEEN $\psi_2$ AND $h'$


where $x$ has been measured from the terminus as an origin ($x = x' - x_0'$).

Thus $\bar{x} = 1.401622$; $s_x = 0.83$; and

$$\frac{n\sum x^2 - (\sum x)^2}{(\sum x)^2} = 0.348441.$$
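
The sample quantities above can be checked directly from the three given sums. The short Python sketch below does only that arithmetic; the variable names are illustrative.

```python
import math

# Data of the numerical example, with x measured from the terminus x_0'.
n = 37
sum_x = 51.8600
sum_x2 = 98.0156

# Mean of the truncated sample about the terminus.
mean_x = sum_x / n

# (n * s_x)^2 = n * sum(x^2) - (sum x)^2
n_s_squared = n * sum_x2 - sum_x ** 2
s_x = math.sqrt(n_s_squared) / n

# Left side of Equation (3), the sample value to be matched by psi_2.
sample_psi2 = n_s_squared / sum_x ** 2
```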


From the graph of Figure 1, we find $h_1' = -1.20$ as a first approximation.
By direct substitution in Equation (18) it is found that $\psi_2(-1.20)
= 0.341734$. Since this value is less than 0.348441 and since $\psi_2$ is an
increasing function of $h'$, it is necessary to select a value greater than
−1.20 as a next approximation. We then find that $-1.20 < h' < -1.10$.
Linear interpolation gives $h_2' = -1.16$ as a closer approximation and
following the next step we establish that $-1.165 < h' < -1.164$. Again
linear interpolation gives $h_3' = -1.1643$, which is accepted as a final
approximation since this determination is sufficiently accurate for our
purposes.

The tables of areas and ordinates of the normal distribution used in 
making these computations contained six significant digits and were 
tabulated at intervals of 0.01 for the argument. In using these tables 
it was necessary to employ interpolation for obtaining entries only for 
the two final approximations. Table 1 details the computations involved 
in the various steps described above. 


TABLE 1

                                    Z        Z − h'   1/(Z − h')  1/(Z − h') − Z    ψ_2
   h'        I_0        φ        (3)÷(2)    (4)−(1)     1/(5)        (6)−(4)      (6)×(7)
   (1)       (2)       (3)         (4)        (5)        (6)           (7)          (8)

 −1.10     .864334   .217852    .252046    1.352046    .739620       .487574      .360619
 −1.20     .884930   .194186    .219436    1.419436    .704505       .485069      .341734

 −1.16     .876976   .203571    .232128    1.392128    .718325       .486197      .349247
 −1.17     .879000   .201214    .228912    1.398912    .714841       .485929      .347362

 −1.164    .877786   .202628    .230840    1.394840    .716928       .486088      .348490
 −1.165    .877988   .202392    .230518    1.395518    .716580       .486062      .348302


It has been the writer's experience that by systematically arranging
computations as shown in the above table the tedium of making each
individual determination of $\psi_2$ is considerably reduced.

By interpolation from Table 1 it is readily found that

$$(Z - h')\big|_{h' = -1.1643} = 1.395043$$

and it follows from Equations (2) and (19) that

$$\sigma' = 1.401622/1.395043 = 1.0047.$$

From Equation (1) it then follows that $m' = 0.850 - (1.0047)(-1.1643)
= 2.020$. This completes the solution.
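
The whole example can also be reproduced without tables. The sketch below is not the paper's table-and-interpolation procedure: it substitutes plain bisection on Equation (18), relying on $\psi_2$ being an increasing function of $h'$, and uses `math.erfc` for the normal area. Function names and the bracketing interval of −3.5 to +3.5 (the range of Figure 1) are assumptions made for the illustration.

```python
import math


def z_minus_h(h):
    """Z(h) - h, where Z is the normal ordinate over the upper-tail area."""
    ordinate = math.exp(-0.5 * h * h) / math.sqrt(2.0 * math.pi)
    area = 0.5 * math.erfc(h / math.sqrt(2.0))
    return ordinate / area - h


def psi2(h):
    """psi_2 as in Equation (18)."""
    b = z_minus_h(h)
    z = b + h
    return (1.0 / b) * (1.0 / b - z)


def solve_h(target, lo=-3.5, hi=3.5, tol=1e-10):
    """Bisection for psi2(h) = target; assumes psi2(lo) < target < psi2(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if psi2(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Data of the numerical example.
n, x0, sum_x, sum_x2 = 37, 0.850, 51.8600, 98.0156

target = (n * sum_x2 - sum_x ** 2) / sum_x ** 2  # left side of Equation (3)
h = solve_h(target)                              # truncation point in standard units
sigma = (sum_x / n) / z_minus_h(h)               # Equations (2) and (19)
m = x0 - h * sigma                               # Equation (1)
```

The values of `h`, `sigma` and `m` obtained this way should agree with the paper's −1.1643, 1.0047 and 2.020 to about the number of places printed.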


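VIII. VARIANCE OF ESTIMATES

R. A. Fisher obtained the following formulas for the variance of $\sigma$
and $h'$ (Fisher's $\xi$)

$$V(\sigma) = \frac{\sigma^4 \mu_2}{n\{\mu_2\mu_2' + \sigma^2(2\mu_2 - \mu_2')\}} \tag{21}$$

$$V(h') = \frac{\sigma^2(\mu_2' + \sigma^2)}{n\{\mu_2\mu_2' + \sigma^2(2\mu_2 - \mu_2')\}} \tag{22}$$

and for the correlation between the sampling errors of $\sigma$ and $h'$

$$r_{\sigma h'} = \frac{\sigma\mu_1'}{\sqrt{\mu_2(\mu_2' + \sigma^2)}} \tag{23}$$

where $\mu_k'$ is the $k$th moment of the truncated sample about its terminus
$x_0'$ and $\mu_k$ is the $k$th central moment of the truncated sample.

It can be shown that, in terms of the function $Z$ defined by Equation
(16), the above equations become

$$V(\sigma) = \frac{\sigma^2}{n}\,\frac{1 - Z(Z - h')}{[1 - Z(Z - h')][2 - h'(Z - h')] - (Z - h')^2} \tag{24}$$

$$V(h') = \frac{1}{n}\,\frac{2 - h'(Z - h')}{[1 - Z(Z - h')][2 - h'(Z - h')] - (Z - h')^2} \tag{25}$$

$$r_{\sigma h'} = \frac{Z - h'}{\sqrt{[1 - Z(Z - h')][2 - h'(Z - h')]}} \tag{26}$$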
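
As a hedged illustration, the variance formulas in terms of $Z$ can be evaluated for the numerical example of Section VII. The sketch assumes the forms of Equations (24) to (26) as reconstructed here (they follow from the information matrix of the truncated normal likelihood) and plugs in the estimates $h' = -1.1643$, $\sigma' = 1.0047$, $n = 37$; the function name is hypothetical.

```python
import math


def truncated_normal_variances(h, sigma, n):
    """Return (V(sigma), V(h'), correlation) from the Z-function formulas."""
    ordinate = math.exp(-0.5 * h * h) / math.sqrt(2.0 * math.pi)
    area = 0.5 * math.erfc(h / math.sqrt(2.0))
    z = ordinate / area                 # Z(h'), Equation (16)

    b = z - h                           # Z - h'
    c = 1.0 - z * b                     # 1 - Z(Z - h')
    a = 2.0 - h * b                     # 2 - h'(Z - h')
    denom = c * a - b * b               # common denominator of (24) and (25)

    var_sigma = (sigma ** 2 / n) * c / denom
    var_h = a / (n * denom)
    corr = b / math.sqrt(c * a)
    return var_sigma, var_h, corr


# Estimates taken from the numerical example of Section VII.
var_sigma, var_h, corr = truncated_normal_variances(-1.1643, 1.0047, 37)
```

The square roots of `var_sigma` and `var_h` give approximate standard errors for the two estimates.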


REFERENCES 

[1] Pearson, Karl and Lee, Alice, “On the Generalized Probable Error in Multiple
Normal Correlation,” Biometrika, Vol. 6, 1908, pp. 59-68.

[2] Lee, Alice, “Table of Gaussian ‘Tail’ Functions When the ‘Tail’ is Larger
Than the Body,” Biometrika, Vol. 10, 1915, pp. 208-215.

[3] Pearson, Karl, editor (1914, 1931). Tables for Statisticians and Biometricians, 
Part I (1914, 3rd edition 1930) and Part II (1931), Cambridge University 
Press. 

[4] Fisher, R. A., “Properties and Applications of Hh Functions,” Introduction 
to Mathematical Tables, Vol. 1, British Assn, for the Advancement of Science, 
1931, pp. xxvi-xxxv. 



ON SOME MATHEMATICAL PROBLEMS ARISING IN 
THE DEVELOPMENT OF MENDELIAN GENETICS* 

Hilda Geiringer
Wheaton College, Massachusetts 

In this paper some basic problems of theoretical Mendelian
genetics are presented. The general situation of an arbitrary
number of linked loci for “diploids” (normal organisms) as
well as for “autopolyploids” is considered, under random mating.
Particular stress is laid upon formulating the problems
and results within the framework of probability theory, expressing
them in terms of three basic distributions, the distributions
of genotypes and of gametes, and the segregation
distribution. The task of the mathematical theory consists in
establishing recurrence relations for the distributions of gametes
and of genotypes, in integrating them and in investigating
the limit behaviour of those distributions as n, the number of
discrete non-overlapping generations, tends toward infinity.

In a paper delivered in 1935 at the Massachusetts Institute of
Technology, J. B. S. Haldane [9, g] spoke on “Some Problems of 
Mathematical Biology.” In this lecture, which covers a much broader 
subject than the present paper, he suggests the following classification 
of the problems of mathematical biology. A first group of problems is 
concerned with the life of the cell; next, one may consider the analysis 
of a whole organism composed of cells, like a tree or an animal; on a 
third level, one studies the mutual relations of a number of organisms
(members of the same brood, individuals of the same population)
and investigates the biological fate of such populations.

The problems dealt with in the present paper belong to the third 
group, which is concerned with entire populations. The mathematical 
tools for these investigations are offered by probability theory and 
statistics, since this is the branch of mathematics which deals with 
mass phenomena. However, we will not consider here the manifold 
problems connected with the statistical evaluation of biological observations.
From a systematic standpoint problems of biological statistics
are not different from the statistical problems occurring in connection 
with other series of observations. The more specific task of mathe¬ 
matical genetics is comparable to what is done in the kinetic theory of 
gases, in the theory of quanta, etc. One starts by formulating proba¬ 
bility laws that express certain biological facts and then tries to deduce 

* Lecture delivered January 1948 at the University of Chicago.




in a mathematical way consequences which are to be checked by ob¬ 
servations. The simplest example of such a procedure presents itself 
in Mendel's theory of the heredity of a single character. 

1. RANDOM INHERITANCE OF ONE CHARACTER. THE BASIC
PROBABILITY DISTRIBUTIONS 

Mendel [17] studied certain single traits or characters, like the 
heredity of the color of the flower of peas or the size (tall or short) of 
this plant, by actually counting the numbers of each type of progeny 
which resulted from a given cross, thus applying statistics to the 
phenomena of inheritance. His fundamental discoveries were made on 
the garden pea, Pisum sativum. Systematic observations and analysis 
led him to the following daring hypothesis. The visible color (red or
white in case of the peas; red, white or pink in case of four-o'clocks)
is dependent on a pair of factors which we may denote for the moment
by R and W. Each fertilized zygote (and each cell of the organism into
which it develops) contains two of these factors and may thus be of the
type RR, or RW, or WW. The gametes (egg and sperm) contain
one factor only, selected at random from among the two factors con¬ 
tained in the cell out of which the gamete is formed. A new zygote of 
the following (filial) generation is then formed by the random union 
of two gametes, and, consequently, again contains two factors, and so 
on. 

Such factors have been called Mendelian factors, unit factors, or 
genes. There is no doubt, today, that the genes are located in the 
chromosomes. Two genes alternative to one another, like R and W,
are spoken of as allelomorphic factors, or alleles. There may be more
than two members of such an “allelomorphic series,” e.g. t = tall,
s = short, and d = dwarf with respect to size. A zygote which contains
two alleles of the same kind, like RR, or WW, or tt, etc., is said to be
homozygous for the factor in question while an RW-plant or a plant of
type td, is called heterozygous or hybrid. Locus means the particular
place in a particular chromosome at which there are alleles. Character 
relates to the effect of genes. A single character, like color, may be 
affected by many loci and, on the other hand, the same locus may 
influence several visible characters. In the case of the flower-color of 
peas, however, this character is determined by one locus. 

In the first paragraph of this section we have presented the essential 
content of Mendel's famous first law. Let us attempt a more mathe¬ 
matical formulation. We assume that, corresponding to each locus, 
there exists a random variable $x$, which may take on $r$ distinct values,
the $r$ alleles, $x = a_1, a_2, \cdots, a_r$. In the example of the color of peas,
$r=2$; here $a_1$ stands for “red” and $a_2$ for “white.” Three alleles determine
the human blood groups, and there are at least $r=14$ known alleles
for the eye color of Drosophila melanogaster, the small vinegar fly which
plays as important a role in modern genetics as did Mendel's peas a
century ago. With respect to such a locus the genetic type of an organism
(zygote as well as grown organism) is specified, not by one, but
by two values, $x$ and $y$, of this random variable which may or may not
be equal to each other. They represent the organism's maternal and
paternal heritage, since it receives one allele from its mother and the
other from its father.

We assume that two organisms are genotypically the same with 
respect to the locus in question if in one of them x comes from the 
mother and y from the father, while in the other organism it is the 
other way round. If we denote the type of an organism by (xy), where 
the first letter denotes the maternal and the second the paternal heri¬ 
tage, this assumption reads, 

(1) $(xy) = (yx).$

Consequently there are $r(r+1)/2$ possible genotypes in this case. If 
r=2 the three types are often denoted by (AA), (Aa), and (aa). Next 
we suppose distinct, non-overlapping generations. A basic assumption 
is then that in a certain “initial” generation the possible types (xy), 
($x = a_1, \ldots, a_r$; $y = a_1, \ldots, a_r$), are distributed according to a 
law of probability. This distribution will be called the initial probability 
distribution of genotypes, $w^{(0)}(xy)$. From it, by means of the hypotheses 
which characterize our problem, the distributions of genotypes for 
later generations will follow. We denote the distribution for the nth 
generation by $w^{(n)}(xy)$. In accordance with (1) we must assume that 

(2) $w^{(n)}(xy) = w^{(n)}(yx)$, $(n = 0, 1, \ldots)$, $(x = a_1, \ldots, a_r;\ y = a_1, \ldots, a_r)$.

There is no loss of generality if we suppose that the initial distribution 
of genotypes is the same for males and females. In fact, it is easily 
seen that under random mating any difference between initial distri¬ 
butions disappears in the first filial generation. For this distribution we 
have 

(3) $\sum_{x} \sum_{y} w^{(n)}(xy) = 1$, $(n = 0, 1, \ldots)$, $(x = a_1, \ldots, a_r;\ y = a_1, \ldots, a_r)$.

Next Mendel assumed that in the formation of a new individual the 



MENDELIAN GENETICS 




parent (through the gamete) transmits to the offspring one of its two 
genes, either x or y. The choice between the two values happens according 
to a probability law. Mendel’s assumption, deduced from and 
confirmed by observation, is that the two probabilities for the segregation 
of either of the two genes are equal. With a view to more general cases, 
we introduce a second basic distribution, the segregation distribution. 
Denote by $l_0$ the probability that the paternal gene be transmitted, by 
$l_1$ the analogous probability for the maternal gene. Then, in the particular 
case under consideration 

(4) $l_0 + l_1 = 1, \quad l_0 = l_1 = \tfrac{1}{2}.$

The segregation distribution is not so trivial in more general situations. 
We shall however retain two of the assumptions made here: (a) The 
segregation distribution is independent of n, i.e. it does not change 
through the generations, (b) It does not depend on the genotype of the 
parent. We shall see that the segregation distribution plays a basic 
role wherever random mating is considered. Certain problems where 
selection is involved cannot be described in terms of the segregation 
distribution. (See section 2.) 

From these two distributions we derive the third important distribution, 
the distribution of gametes, $p^{(n)}(x)$, $(n = 0, 1, \ldots)$, $(x = a_1, \ldots, a_r)$. 
This is the probability that the gamete be of constitution x, i.e. 
possesses the gene x. Let us compute this distribution. In order to 
transmit the gene x the parent must possess this gene and transmit it. 
Accordingly we have, for example, for the gene x=1 (writing 1, 2, ..., 
r instead of $a_1, a_2, \ldots, a_r$), 

$p^{(n)}(1) = w^{(n)}(11)(l_0 + l_1) + w^{(n)}(12)\,l_1 + w^{(n)}(21)\,l_0 + \cdots$

This formula is easily understood. 

For example, the term $w^{(n)}(12)\,l_1$ is the probability that the parent 
possesses the maternal gene “1” and the paternal gene “2,” multiplied 
by the probability, $l_1$, that the transmitted gene be the maternal gene. 
Obviously the sum of all such terms, as contained on the right side of 
the preceding formula, gives the probability of a gamete of type “1,” 
(if we assume certain circumstances which we shall analyze presently). 
Because of (2) and (4) the formula may be written as follows 

(5) $p^{(n)}(x) = \sum_{y=1}^{r} w^{(n)}(xy)$, $(x = 1, \ldots, r;\ n = 0, 1, \ldots)$

with 

(5') $\sum_{x=1}^{r} p^{(n)}(x) = 1.$




A new individual of the (n+l)st generation is formed by the fusion 
of two gametes of the nth generation. The assumption that this fusion 
happens at random is expressed in the formula: 

(6) $w^{(n+1)}(xy) = p^{(n)}(x)\,p^{(n)}(y)$, $(x = 1, 2, \ldots, r;\ y = 1, 2, \ldots, r)$

and that finishes the cycle, since we now know $w^{(n+1)}(xy)$, the distribution 
of genotypes in the next generation. 

Before analyzing the assumptions which underlie these formulas I 
want to derive the famous law of constancy of gametic proportions, first 
recognized by the biologist W. Weinberg [26] and proved by the 
mathematician G. H. Hardy [10]. We obtain from (5) and (6): 

(7) $p^{(n+1)}(x) = \sum_{y=1}^{r} w^{(n+1)}(xy) = \sum_{y=1}^{r} p^{(n)}(x)\,p^{(n)}(y) = p^{(n)}(x)$, $(n = 0, 1, \ldots)$.

From this it follows that 

(7') $w^{(n+1)}(xy) = w^{(n)}(xy)$, $(n = 1, 2, \ldots)$.

These basic results state that the distribution of gametes remains con¬ 
stant throughout, while the distribution of genotypes remains constant from 
the first filial generation on. These results hold in this form in our “sim¬ 
plest case” only. In more complicated problems the first result must, 
in general, be modified and the second fails. 
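
The cycle formed by (5) and (6) can be traced numerically. The following is a minimal Python sketch and not part of the original paper; the function names and the initial distribution `w0` are illustrative assumptions. Genotypes are stored as ordered pairs (maternal, paternal), so that the symmetry (2) is explicit.

```python
from itertools import product

def gametes(w, r):
    """Equation (5): p(x) is the sum of w(x, y) over all y."""
    return [sum(w[(x, y)] for y in range(r)) for x in range(r)]

def fuse(p):
    """Equation (6): random fusion, w'(x, y) = p(x) * p(y)."""
    r = len(p)
    return {(x, y): p[x] * p[y] for x, y in product(range(r), repeat=2)}

# Hypothetical initial genotype distribution for r = 2 alleles (0 and 1).
# It is symmetric, w(x, y) = w(y, x), and sums to one.
w0 = {(0, 0): 0.5, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.3}

p0 = gametes(w0, 2)   # gametic proportions of the initial generation
w1 = fuse(p0)         # genotypes of the first filial generation
p1 = gametes(w1, 2)   # gametic proportions one generation later
w2 = fuse(p1)         # genotypes of the second filial generation
```

In this sketch `p1` agrees with `p0`, and `w2` agrees with `w1` although `w1` differs from `w0`, which is the content of (7) and (7').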

2. REMARKS ON “RANDOM MATING” 

Let us try to give a definition, or rather a mathematical explanation, 
of the concept of random mating as opposed to “selection.” I do not 
propose to discuss the meaning in probability of the concepts “random” 
or “randomness” or “random variable,” which are inherent in any 
probability theory. Although there is much controversy with respect to 
these notions, every statistician attributes a certain meaning to them. 

It seems to me that the mathematical meaning of “random mating” 
is contained in the equations (5) and (6) by which $w^{(n+1)}(xy)$ is derived 
from $w^{(n)}(xy)$. In an equation of type (5), the distribution of gametes in 
the nth generation appears as a linear expression in the $w^{(n)}(xy)$, the 
values of the distribution of zygotes, with constant coefficients which 





are sums of values of the segregation distribution. An equation of type 
(6) expresses the random fusion of the two gametes. In the various cases 
of selection and mutation one or more of the assumptions which lead to 
(5) and (6) cannot be maintained. We shall see, for example, that these 
equations contain the assumption that within the population there are 
no genotypic differences with respect to the attainment of maturity, to 
fecundity, or to mortality (no matter whether such differences are 
“natural” or “planned”). Another assumption contained in our equations 
is that the choice of the mate happens at random. Two examples 
will illustrate these points. 

Consider, as in section 1, the “simplest case” of one locus and assume 
r=2 alleles. Also suppose that the choice of the mate is still due to 
chance, but that there is a differential viability for the three genotypes. 
How do we express this mathematically? 

Write A and a for the two alleles and introduce for the sake of brevity 

$w^{(n)}(AA) = p_n, \quad w^{(n)}(Aa) = q_n, \quad w^{(n)}(aa) = r_n.$

These $p_n$, $q_n$, $r_n$ constitute the distribution of zygotes as generated by 
their parents. In case of a different viability of the types we can no 
longer state that the distribution of genotypes at the time of maturity 
is the same as at time of birth. We have to introduce a different distribution, 
$p_n'$, $q_n'$, $r_n'$, the distribution of the parents of the next generation. 
Various types of selection as indicated above may be accounted for in 
this way. We define $p_n'$, $q_n'$, $r_n'$ by means of “selection coefficients” 
$\alpha$, $\beta$, $\gamma$ (only the proportions of these three numbers matter): 
$p_n' : 2q_n' : r_n' = \alpha p_n : 2\beta q_n : \gamma r_n$, with $p_n' + 2q_n' + r_n' = 1$. The distribution 
of gametes is now deduced from the $p_n'$, $q_n'$, $r_n'$ rather than from the 
$p_n$, $q_n$, $r_n$. Writing $x_n$ and $y_n$ instead of $p^{(n)}(A)$ and $p^{(n)}(a)$, the formula 
which takes the place of (5) is: 

$x_n = p_n' + q_n', \quad y_n = q_n' + r_n'.$

Although (5) no longer holds, the rule expressed in (6) is still valid: 

$p_{n+1} = x_n^2, \quad q_{n+1} = x_n y_n$, etc.

Mathematically this is a problem of a different character, with essentially 
different results from those of the problem of section 1. 
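
The selection scheme of this example can be iterated directly. The Python sketch below is an illustration only; the function names, the starting values and the two sets of selection coefficients are assumptions chosen for the demonstration, and the normalization p' + 2q' + r' = 1 is the one stated above.

```python
def select_and_mate(p, q, r, alpha, beta, gamma):
    """One generation under differential viability.

    p, q, r stand for w(AA), w(Aa), w(aa) with p + 2q + r = 1.
    alpha, beta, gamma are the selection coefficients; only their
    proportions matter, because of the normalization below.
    """
    total = alpha * p + 2 * beta * q + gamma * r
    p_sel = alpha * p / total      # p_n'
    q_sel = beta * q / total       # q_n'
    r_sel = gamma * r / total      # r_n'
    x = p_sel + q_sel              # gametic proportion of A
    y = q_sel + r_sel              # gametic proportion of a
    return x * x, x * y, y * y     # p_{n+1}, q_{n+1}, r_{n+1}

def a_frequencies(p, q, r, coeffs, generations):
    """Proportion of allele a among newly formed zygotes, per generation."""
    out = [q + r]
    for _ in range(generations):
        p, q, r = select_and_mate(p, q, r, *coeffs)
        out.append(q + r)
    return out

# Hypothetical case: genotype aa never reaches maturity (gamma = 0).
lethal = a_frequencies(0.25, 0.25, 0.25, (1.0, 1.0, 0.0), 5)
# Equal coefficients reproduce the constancy found in section 1.
neutral = a_frequencies(0.25, 0.25, 0.25, (1.0, 1.0, 1.0), 5)
```

With equal coefficients the proportion of a stays fixed, while with gamma = 0 it falls in every generation, which illustrates why this problem has essentially different results.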

We quote a second example where the random choice of the mate no 
longer applies and differential viability or an equivalent condition is 
not assumed. The strongest deviation from the concept of random 
choice of the partner presents itself if the genotype of the mate is 
uniquely determined by the individual's own genotype. Assume that 




there exists the rigorous “rule” (natural or artificial) that only identical 
types may mate. Then there are no longer six but only three possible 
types of matings: AA×AA, Aa×Aa, and aa×aa. With our original 
notation we have: 

The third equation is similar to the first, with A and a interchanged. 
These equations replace (5) and (6). To understand the new situation we 
write for the right side of the first of these equations: 

Consider e.g. the second of these two terms. Here 
$2w^{(n)}(Aa) = w^{(n)}(Aa) + w^{(n)}(aA)$ is the probability of a female of type 
(Aa). To get the probability of the mating Aa×Aa we no longer multiply 
$2w^{(n)}(Aa)$ by $2w^{(n)}(Aa)$, as we would in random mating, but by “1,” 
which represents the conditional probability that under the considered 
“law” the male partner be of type Aa if the female is of type Aa. Hence 
$2w^{(n)}(Aa)$ is the probability of the mating Aa×Aa. Finally 1/4 stands for 
the probability that the offspring of the mating Aa×Aa be of type AA. 
More generally, we may denote by $v_{\kappa\lambda}$ the probability of a union of the 
types $\kappa$ and $\lambda$ and by $p_{\kappa\lambda}^{\mu}$ the probability that the offspring of a mating 
($\kappa$, $\lambda$) be of type $\mu$. We assume that for all $\kappa$, $\lambda$: $p_{\kappa\lambda}^{\mu} = p_{\lambda\kappa}^{\mu}$ and $\sum_{\mu} p_{\kappa\lambda}^{\mu} = 1$. 
Here $p_{\kappa\lambda}^{\mu}$ need not be, as in our example, the product of two segregation 
probabilities. It may depend on the types of both parents, whereas our 
segregation distribution refers to each parent separately and does not 
depend on the parental types (see sec. 4). 
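
The rule that only identical types may mate can be sketched as follows. This is a reconstruction from the verbal description above, not a transcription of the paper's equations: it assumes that the mating Aa×Aa occurs with probability 2w(Aa) and yields AA, Aa (in either order) and aa with probabilities 1/4, 1/2, 1/4, while AA×AA and aa×aa reproduce their own type. The function name and starting values are hypothetical.

```python
def identical_mating_step(p, q, r):
    """One generation when only identical genotypes mate.

    p, q, r stand for w(AA), w(Aa), w(aa) with p + 2q + r = 1.
    """
    p_next = p + 0.25 * (2 * q)      # AA from AA x AA and from Aa x Aa
    q_next = 0.5 * (0.5 * (2 * q))   # each of the two orders Aa, aA
    r_next = r + 0.25 * (2 * q)      # aa from aa x aa and from Aa x Aa
    return p_next, q_next, r_next

# Hypothetical starting distribution (p + 2q + r = 1).
state = (0.25, 0.25, 0.25)
history = [state]
for _ in range(10):
    state = identical_mating_step(*state)
    history.append(state)
```

Under these assumptions the heterozygotes are halved in each generation while the gametic proportion p + q of A is unchanged, so the genotype distribution never settles after one generation as it does under random mating.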

We do not attempt to present a scheme which would apply to the 
most general case of “selection.” The preceding examples are intended 
to show how new situations call for new concepts. 

3. SOME MATHEMATICAL PROBLEMS IN MENDELIAN GENETICS 

We now consider some important generalizations of our “simplest 
problem.” In this simplest case there exists no “recurrence problem,” 
since after one good mixing the characteristic distributions do not 
change any more. This is not a typical result. However, the concepts 
introduced for this simplest case remain valid except for certain gen¬ 
eralizations. 

We continue to assume random mating. As an important example 
of a more general situation we first consider various characters of an 
individual simultaneously, as e.g. color of the flower, length of the stem, 
and shape of the seed, etc. The genetic constitution of the individual 
is still determined by the two gametes which represent its maternal 





and paternal heritage. Each gamete, however, no longer consists of a 
single gene but of a set of genes. To denote a genotype let us separate by 
a semicolon the symbols which stand for its maternal and paternal 
heritage. Thus a genotype may be denoted by (x; y) where the letter 
before (after) the semicolon denotes the individual’s maternal (paternal) 
heritage. Also introduce $w^{(n)}(x; y)$, the distribution of genotypes, 
and assume that, as in (1) and (2): 

(1') $(x; y) = (y; x)$

and 

(2') $w^{(n)}(x; y) = w^{(n)}(y; x).$

This differentiation with respect to the maternal and paternal heritage 
is of great importance for the understanding of linkage. It relates how¬ 
ever merely to the formation of new gametes. 

In the example under consideration, where we are concerned with the 
study of m loci, “x” stands for the m “maternal” genes, $x_1, \ldots, x_m$, while 
“y” symbolizes the m “paternal” genes, $y_1, \ldots, y_m$. The kinds of 
gametes which this organism may produce depend on the possible combinations 
of the material it has inherited. These combinations happen 
according to a probability law which we again call the segregation distribution. 
It is one of the main tasks of the theory to define a segregation 
distribution which corresponds to given biological conditions. In 
a recent paper [5, d], R. A. Fisher says: “The laws of inheritance are 
the rules whereby, given the constitution of an organism, the kinds of 
gametes it can produce and their relative frequencies can be predicted.” 
Next, the distribution of gametes $p^{(n)}(z)$ is derived from the distribution 
of genotypes by means of the segregation distribution. Finally, the 
random fusion of the two gametes, expressed in the formula 
$w^{(n+1)}(x; y) = p^{(n)}(x)\,p^{(n)}(y)$, completes the cycle of inheritance. 

It is easy to indicate a few mathematical problems which present 
themselves in this and in similar situations: 

a) The possible genotypes are to be completely and simply enumer¬ 
ated under consideration of the specific biological conditions which 
characterize a given situation. The same problem exists for the various 
kinds of gametes. 

b) An adequate segregation distribution is to be defined. 

In a), as well as in b), the mathematical approach can help to simplify 
and clarify the concepts. As in other sciences the actual situation in 
nature is often so complex that “models” have to be constructed representing 
a compromise between mathematical simplicity and biological 
adequacy. By such a procedure we introduce for example in mechanics 
models like the “rigid body,” the “elastic body,” the “ideal 
fluid,” etc. 

c) The distribution of gametes, $p^{(n)}(z)$, is to be derived from $w^{(n)}(x; y)$ 
by means of the segregation distribution, and recurrence relations for the 
$p^{(n)}(z)$ are desired. They are, in general, simpler than those for the 
$w^{(n)}(x; y)$; under random mating both are equivalent. 

d) We wish to integrate these recurrence equations, which are, in 
general, nonlinear difference equations with constant coefficients; that 
means we want to express $p^{(n)}(z)$ in terms of “n,” of the initial values 
$p^{(0)}(z)$ (which in turn have to be derived from the $w^{(0)}(x; y)$), and of 
the parameters involved in the definition of the segregation distribution. 

e) We want to know whether one or more states of equilibrium prevail 
for $p^{(n)}(z)$ and for $w^{(n)}(x; y)$ and if so under what conditions. To 
settle this we also have to investigate the limit behaviour of $p^{(n)}(z)$ and 
$w^{(n)}(x; y)$ as “n” tends towards infinity. 

In the following we shall comment on the problems a)-e) in the case 
of: 1) m loci (linkage), assuming autosomal genes and “diploids”; 2) so-called 
autopolyploids (as opposed to “diploids”) for one as well as for 
several loci. 

4. LINKAGE IN THE CASE OF m LOCI AND r ALLELES 

As in the first example of the preceding section assume m loci. 
Mendel's original assumption was that all possible gametic combinations 
are equally probable. This assumption of independent assortment 
does not introduce any parameters since each of the $2^m$ possible combinations 
appears with probability $1/2^m$.¹ 

This simple conception was shaken by the observation of “linkage” 
associated with the names of Bateson and Punnett, of Morgan, Sturtevant, 
and of many other well known biologists of our time. It appeared 
that not all possible gametes occur with equal frequency. Consider for a 
moment the case m=2, and r alleles. We denote a type by $(x_1 x_2; y_1 y_2)$ 
where $x_i$ as well as $y_i$, (i = 1, 2), relate to the ith locus and each of them 
takes on the values 1, 2, ..., r. The four possible types of gametes are 
then $x_1 x_2$, $x_1 y_2$, $y_1 x_2$, $y_1 y_2$, since one gene is transmitted with respect to 
each locus. In Mendel's conception each of these four possibilities 
has the probability 1/4. In modern genetics these four probabilities are 

1 It is referred to by many authors as “random segregation.” This term should not be confused 
with the general use of the term “random” underlying each probability distribution and not merely 
the particular one where all probabilities are equal. Such a probability distribution is often called 
“uniform” in probability calculus. 





assumed to be (1-c)/2, c/2, c/2, (1-c)/2, where c is the so-called 
“recombination value.” It is often assumed by biologists that c is less 
than 1/2. This implies that the genes that came in together exhibit a 
tendency to stay together; mathematically, however, c may have any 
value between zero and one. The introduction of the parameter c corresponds 
to assuming a certain influence of the grand-parents. The case, 
m=r=2, has been completely investigated by Weinberg [25], Robbins 
[20] and Jennings [13, b, c]. 

In the general case there are $m(m-1)/2$ recombination values, $c_{ij}$. 
These may be defined in a way which is independent of additional assumptions 
like “chiasma theory,” or “linear theory”: $c_{ij}$ ($i = 1, \ldots, m$; 
$j = 1, \ldots, m$; $i \neq j$) is the probability that those transmitted genes 
whose subscripts are i and j come from different parents, or, in other 
words, the probability that either $x_i$ and $y_j$ or $x_j$ and $y_i$ be transmitted, 
no matter what happens to the (m-2) other factors. It follows that $c_{ij}$ 
is a marginal probability of our segregation distribution. It will be seen 
that, unless we admit some additional biological hypothesis, the 
$m(m-1)/2$ recombination values are not sufficient for the description 
of general linkage. 

In case of m loci a genotype is characterized by two m-dimensional 
vectors, $x_i$ and $y_i$ ($i = 1, \ldots, m$; $x_i = 1, \ldots, r$; $y_i = 1, \ldots, r$). This simple 
idea, absolutely basic in the author’s approach, is not generally accepted. 
Often (the consideration being limited to two or three alleles) 
a type is simply denoted by m pairs of numbers, which is all right only 
if independent assortment is assumed. Then there are just nine types 
for m=r=2 and in general $[r(r+1)/2]^m$ types. For the understanding of 
linkage, however, the assumption of two sets, each of m numbers, is 
necessary. This amounts to distinguishing for example between (AB; 
ab) and (Ab; aB); in fact, these two types are different with respect to 
the segregation of gametes, since the first transmits AB with 
probability (1-c)/2, and the second transmits AB with probability 
c/2. There are in this case ten and not nine types and in the general 
case there are $r^m(r^m+1)/2$ types. 
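
The two ways of counting types can be checked by brute force. The Python sketch below is illustrative; the function names are assumptions. It enumerates a genotype as an unordered pair of gametes, each gamete being a vector of m alleles, and compares the count with the number obtained when a type is recorded only as m unordered pairs of alleles.

```python
from itertools import combinations_with_replacement, product

def types_under_linkage(m, r):
    """Unordered pairs (x; y) of m-dimensional gametes, with (x; y) = (y; x)."""
    gametes = list(product(range(r), repeat=m))
    return sum(1 for _ in combinations_with_replacement(gametes, 2))

def types_under_independent_assortment(m, r):
    """m unordered pairs of alleles, one pair per locus."""
    return (r * (r + 1) // 2) ** m

linked_2_2 = types_under_linkage(2, 2)                      # 10
independent_2_2 = types_under_independent_assortment(2, 2)  # 9
```

The tenth type for m = r = 2 arises because (AB; ab) and (Ab; aB) are kept apart in the first count and merged in the second.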

Thus, a type is denoted by $(x_1, \ldots, x_m; y_1, \ldots, y_m)$. In the formation 
of a gamete a new set of m elements is composed in such a way that 
corresponding to each of the m subscripts either the x-value or the 
y-value is chosen. Consequently there are $2^m$ possibilities. The gamete 
will consist of some of the $x_i$ and of some of the $y_i$. By fusion of this 
gamete with another, a zygote is formed which contains in a new combination 
some of the material found in the parents. This new combination, 
again, will not persist: when the new being forms sex cells the 




new gamete will exhibit a new characteristic combination and the old 
combinations disappear. Thus the genes recur from generation to gen¬ 
eration passing through many individuals which they determine (with 
respect to certain properties). The combinations change, but not the 
constituent genes. (This is a very simplified scheme which does not 
even cover mutations.) 

To describe adequately the $2^m$ possible assortments we introduce a 
segregation distribution which, in this case, may be called a linkage 
distribution. Denote by S the set of numbers, 1, 2, ..., m, by T any 
subset of S ($\emptyset \subseteq T \subseteq S$), and by $l_T$ the probability that the maternal genes 
belonging to T and the paternal genes belonging to T' = S - T be transmitted. 
Note that this definition is independent of the genotype of the 
parent. There are $2^m$ such probabilities. For obvious symmetry reasons 
we must assume 

(7) $l_T = l_{T'}.$

Hence, since the sum of all probabilities is one, we have introduced 

$M = 2^{m-1} - 1$

parameters. For $m \geq 4$, $M > m(m-1)/2$; so there are in general more 
linkage-parameters than recombination values. It is desirable to describe 
linkage in terms of the $m(m-1)/2$ recombination values, that is, 
to express the M linkage parameters by these values. This problem is 
closely connected with the so-called “linear arrangement” of the genes 
of the same linkage group and with the concept of “distance” of linked 
genes. R. A. Fisher [5, e] recently offered a very suggestive solution. 
An older solution, fairly generally accepted (but in the author's 
opinion open to criticism) is due to Haldane [9, c]. Whatever the relation 
between linkage distribution and recombination values, the linkage 
distribution must be assumed known if we wish to study the heredity 
problem of linked genes. 
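
The relation between the linkage distribution and the recombination values can be made concrete. In the Python sketch below (an illustration, with hypothetical function names) a linkage distribution is a mapping from subsets T of {1, ..., m} to probabilities, and a recombination value is computed as the marginal probability that exactly one of the two loci belongs to T, i.e. that the two transmitted genes come from different parents.

```python
from itertools import combinations

def subsets(m):
    """All subsets T of S = {1, ..., m}, as frozensets."""
    idx = range(1, m + 1)
    return [frozenset(c) for k in range(m + 1) for c in combinations(idx, k)]

def recombination_value(l, i, j):
    """c_ij: total probability of the subsets containing exactly one of i, j."""
    return sum(prob for T, prob in l.items() if (i in T) != (j in T))

def linkage_parameters(m):
    """Number of free parameters of a symmetric linkage distribution."""
    return 2 ** (m - 1) - 1

# Independent assortment for m = 3: all 2^m probabilities are equal.
independent = {T: 1.0 / 8 for T in subsets(3)}

# Two loci with a hypothetical recombination value c = 0.2.
c = 0.2
two_loci = {
    frozenset(): (1 - c) / 2, frozenset({1, 2}): (1 - c) / 2,
    frozenset({1}): c / 2, frozenset({2}): c / 2,
}
```

For m = 3 the three recombination values match the three free parameters; for m = 4 the sketch gives seven parameters against six recombination values, in line with the remark above.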

5. SOLUTION OF THE LINKAGE PROBLEM 

For the above linkage problem, the author has solved the problems 
indicated in section 3. We have considered a) the enumeration of the 
possible genotypes under linkage and b) the definition of the linkage 
distribution. Before continuing, some important particular forms of the 
linkage distribution should be specified: 1) In the case of independent 
assortment (random segregation), the $2^m$ values of the linkage distribution 
are all equal to each other, hence each equal to $1/2^m$. 2) If there are 





S distinct “linkage groups” the m-dimensional linkage distribution² resolves 
into the product of S probability distributions. 3) The m loci 
may form several groups of completely linked loci. Then, all recombination 
values within each group are zero while all recombination values 
between members of any two different groups equal 1/2. 

The main problem is the recurrence problem (problem c) of sec. 3) 
which always occupies the central place. In the case of m arbitrarily 
linked (autosomal) genes it has been solved [7a], [13b]. To explain 
the solution we need the concept of a “marginal distribution,” well 
known to statisticians. If $p(z_1, z_2, \ldots, z_m)$ denotes a discrete (arithmetical) 
probability distribution in m variates, then 
$p_{12}(z_1 z_2) = \sum_{z_3} \cdots \sum_{z_m} p(z_1 z_2 z_3 \cdots z_m)$ is the probability of the result $(z_1 z_2)$, 
and similarly $p_{ij}(z_i z_j)$, $p_i(z_i)$ or $p_{ijk}(z_i z_j z_k)$ may be defined. Note that 
there are marginal distributions of “order 1,” of “order 2,” etc. Obviously 
we may write $p_T(z_T)$ for $p_{ijk}(z_i z_j z_k)$ if T denotes the set (i, j, k) 
and $z_T$ the set $(z_i z_j z_k)$. If, finally, we write $p^{(n)}(z)$ for $p^{(n)}(z_1 z_2 \cdots z_m)$, 
our recurrence formula reads 

(9) $p^{(n+1)}(z) = \sum_{T} l_T\, p_T^{(n)}(z_T)\, p_{T'}^{(n)}(z_{T'})$

where the sum is over all subsets T of S and $p_T(z_T)$ is the marginal 
distribution whose subscripts are the points of T. We have e.g. (anticipating 
the result (11)): 

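m = 2: $p^{(n+1)}(z_1 z_2) = 2\,l(00)\,p^{(n)}(z_1 z_2) + 2\,l(01)\,p^{(0)}(z_1)\,p^{(0)}(z_2)$

m = 4: $p^{(n+1)}(z_1 z_2 z_3 z_4) = 2\{\,l(0000)\,p^{(n)}(z_1 z_2 z_3 z_4)$
$\quad + [\,l(1000)\,p^{(0)}(z_1)\,p^{(n)}(z_2 z_3 z_4) + \cdots + l(0001)\,p^{(0)}(z_4)\,p^{(n)}(z_1 z_2 z_3)\,]$
$\quad + [\,l(1100)\,p^{(n)}(z_1 z_2)\,p^{(n)}(z_3 z_4) + l(1010)\,p^{(n)}(z_1 z_3)\,p^{(n)}(z_2 z_4) + \cdots\,]\}.$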

In our recurrence formulas the values of the linkage distribution act 
as essential “separators” between meaningful groups of probabilities. 
This clear and simple recurrence relation is characteristic for “random 
mating” and “chromosome segregation” (see also next section). 
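
The recurrence relation for the gametic distribution can be sketched in Python as follows. This is an illustration under stated assumptions and not the author's own computation: loci are indexed from 0, a gamete is a tuple of alleles, the linkage distribution is a dictionary keyed by the tuple of loci T whose genes are taken from the maternal set, and all names and numerical values are hypothetical.

```python
from itertools import product

def marginal(p, T):
    """Marginal distribution of p over the loci listed in the tuple T."""
    out = {}
    for z, prob in p.items():
        key = tuple(z[i] for i in T)
        out[key] = out.get(key, 0.0) + prob
    return out

def next_gametes(p, l, m, r):
    """p^(n+1)(z) = sum over T of l_T * p_T(z_T) * p_T'(z_T')."""
    new = {}
    for z in product(range(r), repeat=m):
        total = 0.0
        for T, l_T in l.items():
            Tc = tuple(i for i in range(m) if i not in T)   # complement T'
            z_T = tuple(z[i] for i in T)
            z_Tc = tuple(z[i] for i in Tc)
            total += (l_T * marginal(p, T).get(z_T, 0.0)
                          * marginal(p, Tc).get(z_Tc, 0.0))
        new[z] = total
    return new

# Two loci, two alleles, hypothetical recombination value c = 0.2.
c = 0.2
l = {(): (1 - c) / 2, (0, 1): (1 - c) / 2, (0,): c / 2, (1,): c / 2}

# Hypothetical initial gametes: only AB = (0, 0) and ab = (1, 1) occur.
p0 = {(0, 0): 0.5, (1, 1): 0.5}
p1 = next_gametes(p0, l, 2, 2)
```

For m = 2 the sketch reduces to p' = (1 - c) p + c p_1 p_2, so that in this example the proportion of AB gametes moves from 0.5 to 0.45 while the proportion of each single gene stays at 0.5.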

The next problem consists in the solution of the recurrence equations. 

² A definition of the linkage distribution, completely equivalent to the previous one, is the following: 
There are $2^m$ probabilities with sum one, $l(\epsilon_1, \epsilon_2, \ldots, \epsilon_m)$, where each $\epsilon_i$ equals either zero or one. $\epsilon_i = 1$ 
means that the transmitted gene corresponding to the ith factor is the maternal one, while $\epsilon_i = 0$ means 
that the paternal value has been transmitted. Instead of (7) we have 

$l(\epsilon_1, \ldots, \epsilon_m) = l(1-\epsilon_1, \ldots, 1-\epsilon_m).$




These form a system of quadratic difference equations with constant coefficients. 
It has been shown that this solution can be found [7a]; it is 
however no simple matter so long as everything remains unspecified. 
In the case of independent assortment a very elegant solution has been 
given by H. Tietze [24] who established the recurrence relations for 
this case and investigated the limit behaviour in full generality. Jennings 
[13, b, c] and Robbins [20] have solved the general linkage problem 
for m=2, where one parameter enters, including an explicit expression 
for $p^{(n)}(z)$. In the case m=3, where three parameters are involved, 
the explicit solution given by the author [7b] is still very simple. 

Unless we have complete linkage ($l(1, \ldots, 1) = l(0, \ldots, 0) = \tfrac{1}{2}$, all 
other linkage probabilities zero) there is no equilibrium for finite n. The 
investigation of the limit behaviour of $p^{(n)}(z)$ as $n \to \infty$ becomes important. 
The author has established [7a] general results which may be 
formulated in the following statements: 

First it is shown that, just as in the case m= 1, the gametic proportion 
of each single gene remains constant through the generations 

(11) $p_i^{(n)}(z_i) = p_i^{(0)}(z_i)$, $(i = 1, 2, \ldots, m;\ n = 0, 1, \ldots)$.

This is the mathematical expression of the “immortality” of the genes. 
Next, consider the joint distribution $p^{(n)}(z)$. There are two important 
particular cases which have to be settled first. Assume $l(00 \cdots 0) = \tfrac{1}{2}$; 
then all recombination values equal zero, or, in other words, all “mixed” 
gametes, containing partly paternal and partly maternal genes, are a 
priori excluded. We then have for all n: 

(12) $p^{(n)}(z_1 \cdots z_m) = p^{(0)}(z_1 \cdots z_m).$

Completely linked genes act like one. The other extreme is that no recombination 
value equals zero. Then it follows that 

(13) $\lim_{n \to \infty} p^{(n)}(z_1 \cdots z_m) = p_1^{(0)}(z_1) \cdots p_m^{(0)}(z_m).$

In Mendel-Tietze's [24] case of independent assortment each $c_{ij} = \tfrac{1}{2}$, 
hence the above condition is satisfied and (13) holds. 

Finally, if $c_{ij} > 0$ does not hold for all pairs i, j, the linkage distribution 
degenerates in various ways into groups of complete linkage. 
By that we mean a set of genes (i, j, ..., k) within which no recombination 
takes place, i.e. all corresponding $c_{ij}$ are zero. For example if (1, 3, 6) 
is such a set, it can be shown that the marginal distribution $p_{136}^{(n)}$ is 
preserved through the generations: $p_{136}^{(n)} = p_{136}^{(0)}$ holds for all n, besides 
(11). To describe the limit behaviour completely we use the term: 




maximal set of completely linked genes for a subset T of S containing as 
many as possible of the m numbers with no recombination within the set. 
Assume e.g. m=8, and that the group (1, 4) as well as the group 
(2, 3, 5, 6) are each completely linked, while the recombination values 
$c_{12}$, $c_{17}$, $c_{18}$, $c_{27}$, and $c_{28}$ are known to be different from zero. Then (14), 
(2356), (7), (8) are the maximal sets of completely linked genes and, in 
an obvious notation: 

(14) $\lim_{n \to \infty} p_{12 \cdots 8}^{(n)} = p_{14}^{(0)}\, p_{2356}^{(0)}\, p_{7}^{(0)}\, p_{8}^{(0)}$

or, more generally [7, b]: 

If $S_1, \ldots, S_t$ are the maximal sets of completely linked characters 
($t \leq m$), each containing at least one element, then 

(15) $\lim_{n \to \infty} p_{12 \cdots m}^{(n)} = p_{S_1}^{(0)}\, p_{S_2}^{(0)} \cdots p_{S_t}^{(0)}.$

The general theorem may be expressed in a more complete way: 

Under the conditions of our investigation (m arbitrarily linked genes, 
random mating, etc.): For the gene distribution in successive generations 
the original (marginal) distributions of each “maximal group” $S_i$ are 
preserved: $p_{S_i}^{(n)} = p_{S_i}^{(0)}$ for all n; as $n \to \infty$, the joint distribution 
of the m random variables approaches a limit where the different sets of 
genes are independently distributed. 

It is obvious that from these results corresponding results for the 
$w^{(n)}(x; y)$ follow easily, by means of the equation which is the analogue 
of (6). 

It is biologically important and mathematically interesting to consider 
the extension where different linkage distributions for males and 
females are assumed. At the suggestion of S. Wright, who considered 
the problem for m=2 and m=3, the author has investigated this extension 
[7, d], assuming two linkage distributions $l$ and $l'$. It turns out 
that the recurrence relations change, but not in an essential way: the 
arithmetic means of the values of the two linkage distributions play a 
decisive role. There appear now two different distributions of gametes 
corresponding to the two sexes, and simple recurrence relations hold for 
their arithmetical mean. As $n \to \infty$ the two gametic distributions approach 
the same limit which is independent of the linkage distributions 
and where the m genes are independently distributed. 

6. THEORETICAL GENETICS OF AUTOPOLYPLOIDS 

Thus far we have discussed the extension of Mendel's original ideas 





to the case of linkage. A second remarkable extension has come with 
the discovery of polysomic inheritance, as opposed to normal or diploid 
inheritance. We shall discuss the important and interesting case of 
autopolyploids. This problem has been studied by numerous biologists. 
Among them we mention, because of the more theoretical aspect of 
their work, R. P. Gregory, H. J. Muller, J. B. S. Haldane [2], [9, e], 
S. Wright [26, c], K. Mather [16, b, c], [6], and R. A. Fisher [6, b, 
c, d]. 

In the preceding pages we assumed “chromosome segregation” as 
opposed to “chromatid segregation.” It can be shown easily that for 
normal or diploid organisms, as considered so far, there is, mathe¬ 
matically, no difference between chromosome segregation and chroma¬ 
tid segregation. Thus the results of section 5 are general. This situation 
changes for polyploids, where there is a definite difference between the 
two modes of segregation. Chromosome segregation may be considered 
as an approximation to chromatid segregation; this latter one actually 
prevails according to modern studies. Nevertheless we shall discuss 
mainly chromosome segregation, since even under this simpler assump¬ 
tion the problem of polyploids appears rather complicated and un¬ 
familiar to the statistician. 

In the simplest case where only one locus is considered, an autopolyploid 
organism may be described as follows: In case of a 2s-ploid each 
gamete consists not of one but of s genes. A genotype possesses two 
sets, each of s genes, each gene being represented by one of the r numbers 
1, 2, ..., r, the r alleles, where r may be greater than, less than, or 
equal to s. Although a genotype is mathematically described by two 
sets, each of s numbers, there are fewer than $N = r^{2s}$ distinct types since 
many types have to be considered as equal. First, we have as before: 
(x; y) = (y; x). Moreover, it is assumed with respect to the maternal 
(paternal) heritage that there is no difference between the various 
permutations of the s numbers which constitute this heritage. In ac¬ 
cordance with this, a type is denoted, e.g. for s= 6 , by (oiW: 01040 * 07 *). 
It is then easily seen that there are 


(16) $N_1 = \tfrac{1}{2} R(R+1)$ genotypes, where $R = \binom{r+s-1}{s}$.
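
The count of genotypes can be checked by enumeration. The Python sketch below is illustrative and rests on one assumption spelled out here: R is taken to be the number of distinct gametes, i.e. the number of selections of s alleles out of r with repetition and without regard to order, which is the binomial coefficient C(r+s-1, s). The function names are hypothetical.

```python
from itertools import combinations_with_replacement
from math import comb

def gamete_types(r, s):
    """All unordered selections of s alleles out of 1..r, repetition allowed."""
    return list(combinations_with_replacement(range(1, r + 1), s))

def genotype_count(r, s):
    """Closed form R(R+1)/2 with R = C(r+s-1, s) (assumed meaning of R)."""
    R = comb(r + s - 1, s)
    return R * (R + 1) // 2

def genotype_count_by_enumeration(r, s):
    """Unordered pairs (x; y) = (y; x) of gamete types, counted directly."""
    g = gamete_types(r, s)
    return sum(1 for _ in combinations_with_replacement(g, 2))

tetraploid_two_alleles = genotype_count(2, 2)   # 6
```

For s = 1 the formula falls back to the r(r+1)/2 genotypes of the diploid case.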



With respect to segregation the results of observations suggest the 
following: In the formation of a new individual each parent transmits 
to the offspring a set of s genes out of the 2s genes the parent possesses. 
The selection of these s genes happens according to a probability distribution 
called the segregation distribution, which need not be the 
same for males and females. We assume here, however, that there is 
one and the same segregation distribution for both sexes. Moreover, 
the segregation distribution is independent of the genotype of the parent 
and of n. Let us consider its definition. Out of 2s numbers a set of 
s numbers can be selected in S ways, where 

(17) $S = \binom{2s}{s} = \frac{(2s)!}{s!\,s!}.$

These S cases (there are six in the case of the most often considered tetraploid,
s = 2) have been more or less tacitly assumed equally probable,
first by Muller, then, as far as I know, by all biologists concerned with
the problem. This assumption is not logically necessary and does not
seem inevitable in the light of reported observations. We therefore introduce
a segregation distribution where the S probabilities are not a
priori assumed equal. On the other hand we must avoid introducing
parameters that are not meaningful biologically. After considering the
analogy to other linkage phenomena and studying the numerical results
of observations, the author [7c] was led to the hypothesis that
within the s transmitted genes the proportion of maternal and paternal
genes plays a certain role. We thus make the following definition: Call

$\binom{s}{\alpha}\, l_\alpha$   $(\alpha = 0, 1, \ldots, s)$

the probability that a specified set of $\alpha$ maternal genes be transmitted.
Assume symmetry of paternal and maternal heritage:

(18)   $l_\alpha = l_{s-\alpha}$.

Since a set of $\alpha$ specified maternal genes may be combined with $(s-\alpha)$
paternal genes in $\binom{s}{\alpha}$ ways, we have

(19)   $\sum_{\alpha=0}^{s} \binom{s}{\alpha}^{2} l_\alpha = 1$.

Thus if s = 2p or 2p+1, just p parameters are introduced. These p
parameters may be partly or all equal to each other. If $l_\alpha = 1/S$ we have
“random chromosome segregation” (Muller, Haldane). Other particular
assumptions may be considered.
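The grouping of the S possible sets by the number of maternal genes they contain can be made concrete by enumeration. This is a minimal sketch, assuming hypothetical labels "M" and "P" for the s maternal and s paternal genes of a parent; it only counts sets and does not depend on any particular choice of segregation probabilities.

```python
from collections import Counter
from itertools import combinations
from math import comb


def sets_by_maternal_count(s):
    """Count the s-gene sets of a 2s-ploid parent by number of maternal genes."""
    genes = [("M", i) for i in range(s)] + [("P", i) for i in range(s)]
    counts = Counter()
    for chosen in combinations(genes, s):
        alpha = sum(1 for origin, _ in chosen if origin == "M")
        counts[alpha] += 1
    return dict(counts)


for s in (2, 3, 4):
    counts = sets_by_maternal_count(s)
    S = comb(2 * s, s)
    # Each class alpha holds C(s, alpha)^2 sets, and the classes exhaust all S sets.
    assert all(counts[a] == comb(s, a) ** 2 for a in range(s + 1))
    assert sum(counts.values()) == S
    print(s, S, counts)
```

For the tetraploid the six sets split into 1, 4 and 1 sets with 0, 1 and 2 maternal genes; giving every set probability 1/S, as in random chromosome segregation, makes the total probability equal to one.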

Although there is a certain apparent similarity between this mathematical
formulation and that in the linkage problem, the situations
are in fact quite different. In the polyploid problem, any $\alpha$ maternal
genes may be segregated together with any $(s-\alpha)$ paternal ones, while
in the linkage problem the new set of m genes has to contain exactly
one value, either the $x_i$ or the $y_i$, with respect to each of the m subscripts,
1, 2, ..., m.

Our three basic distributions, the distributions of genotypes and of
gametes and the segregation distribution, have thus been defined. The
next step, and the most important one, consists in finding a general
recurrence formula which permits us to derive step by step the $p^{(n)}$ for
n = 1, 2, ..., starting with $p^{(0)}$. It turns out that there exists for polyploids
too a surprisingly simple recurrence relation. It is based on the
above concepts and on the use of certain types of “marginal distributions”
whose definition is much less obvious than in the linkage problem;
this recurrence law holds for any “r” and any “s.” For a homozygotic
gamete of type $(A^s)$ our recurrence formula is simply

(20) p«(^*) = ia- ( * ) 

The and are marginal distributions defined in agree¬ 

ment with the general concept of a marginal distribution. For example 
$p^{(n)}(A^\alpha)$ is the probability of a gamete with $\alpha$ A-genes, no matter what
its remaining $(s-\alpha)$ genes may be. If we consider heterozygotes, r different
allelomorphs, where $r \le s$, so that a gamete is of type
$(a_1^{x_1} \cdots a_r^{x_r})$ with $x_1 + \cdots + x_r = s$, we obtain a result which is not
much more complicated than (20).

By means of these recurrence relations the author [7c] has derived a
limit theorem, as $n \to \infty$. Haldane [9e] has indicated a distribution
which reproduces itself under random segregation and thus represents
a state of equilibrium. This result leaves open the biological and mathematical
question of whether such an equilibrium is actually reached,
and if so, under what conditions. Our limit theorem may be formulated
as follows:

Denote the r alleles by $a_1, \ldots, a_r$, and consider one locus, and chromosome
segregation. Then

1) As in the case of a diploid (sec. 1) the gametic proportions corresponding
to each single allele remain unchanged through the generations,
$p^{(n)}(a_i) = p^{(0)}(a_i)$, $(i = 1, \ldots, r;\ n = 0, 1, \ldots)$.

2) If and only if $l_0 < \tfrac{1}{2}$ (i.e. if “mixed gametes” are not a priori excluded)
the joint distribution of gametes converges towards a limit where
the alleles are independently distributed.

(21)   $\lim_{n \to \infty} p^{(n)}(a_1^{x_1} \cdots a_r^{x_r}) = [p^{(0)}(a_1)]^{x_1} \cdots [p^{(0)}(a_r)]^{x_r}$.
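The two statements of the limit theorem can be illustrated numerically in the simplest case. The sketch below is not the general recurrence (20); it assumes a tetraploid (s = 2), two alleles A and a, random mating, and random chromosome segregation with all six sets equally probable. Under these assumptions a transmitted gamete is a copy of one of the parent's two constituent gametes with probability 1/3 and contains one gene from each with probability 2/3. The function name and the starting frequencies are hypothetical.

```python
def next_generation(gametes):
    """One generation of gamete frequencies for a tetraploid, two alleles.

    `gametes` maps the gametic types 'AA', 'Aa', 'aa' to their frequencies.
    """
    a = gametes["AA"] + 0.5 * gametes["Aa"]  # frequency of allele A among gametic genes
    b = 1.0 - a
    return {
        # 1/3: both genes from the same parental gamete; 2/3: one gene from each.
        "AA": gametes["AA"] / 3 + 2 * a * a / 3,
        "Aa": gametes["Aa"] / 3 + 2 * (2 * a * b) / 3,
        "aa": gametes["aa"] / 3 + 2 * b * b / 3,
    }


# Start far from equilibrium: only homozygotic gametes, allele frequency 1/2.
history = [{"AA": 0.5, "Aa": 0.0, "aa": 0.5}]
for _ in range(10):
    history.append(next_generation(history[-1]))

for n, g in enumerate(history):
    allele_a = g["AA"] + 0.5 * g["Aa"]
    print(n, round(allele_a, 6), round(g["AA"], 6))
```

In this run the allele frequency stays at 1/2 in every generation, while the frequency of the homozygotic gamete AA moves from 1/2 towards the square of the allele frequency, the remaining deviation being divided by three at each step.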





In the case of Haldane’s “random segregation” (where all segregation 
probabilities are equal to each other), the condition of our limit 
theorem is satisfied and our result thus proves and completes his state¬ 
ment. 

While chromatid segregation describes the biological situation more
correctly than chromosome segregation, it seems that the latter, with
an appropriate general segregation distribution, presents a useful approximation.
It would lead us too far to attempt an explanation of
chromatid segregation. A few remarks must suffice. Chromosome segregation
is a particular case of chromatid segregation, where certain
probabilities, corresponding to “double reduction,” are assumed to
equal zero. A segregation distribution for chromatid segregation has
been introduced by K. Mather and R. A. Fisher [6]. This basic approach
has been generalized by the author [7, f] so as to contain as
particular cases Fisher-Mather’s segregation distribution [6], Haldane’s
“random chromatid segregation” [9, e], and my general chromosome
segregation distribution [7, c]. These investigations are for
s = 2, 3, and 4. They include (according to the program outlined in
section 3) recurrence relations; solution of those difference equations;
and limit results. It is worth mentioning that in this case the recurrence
relations, as well as the limit theorems, are essentially different, and
this in a very interesting way, from the results (20) and (21). While
these last are of the same type as the corresponding results (9) and (13),
the problems of chromatid segregation introduce an entirely new type
of statistico-biological laws.

The next step leads to the study of polyploids under consideration of
linkage (several loci). In a particular case (data from tetraploid primulas),
DeWinton and Haldane [2] have proposed a linkage theory for
tetraploids under chromosome segregation. A suggestively simple segregation
distribution is introduced on the basis of evidence which indicated
a certain “pairing” as prevailing in this plant. In a more recent
paper Fisher [5, d] considers linkage under chromatid segregation. His
paper is concerned with the enumeration of gametes and of genotypes
and to a certain extent with the definition of a segregation distribution,
problems which in this general form are far from easy.

The formulation of the problem of 2s-ploids with m loci must include
the problem of diploids for m loci (sections 4 and 5) as well as that of a
2s-ploid for one locus (section 6). The general problem must reduce to
these problems and respective results if either s = 1 or m = 1. Further,
the case of a $2s_1$-ploid with $m_1$ loci ($s_1 \le s$, $m_1 \le m$) must appear as a
“marginal” case of the more general one. Accordingly, I introduce [7, e]
a sufficiently general segregation distribution. Recurrence formulas are
also derived and the limit behaviour of the characteristic distributions
is completely investigated. The results are simple and complete. It
seems that these structurally clear recurrence formulas describe relevant
features of a rather general biological situation: 2s-ploid, m loci,
r alleles, random mating, chromosome segregation, with Mendel’s
theory of heredity as the basis. A similar remark holds for the limit
theorems [7f].

The study of linkage of polyploids under chromatid segregation seems
to be very difficult. The aim is, of course, to study the problem by
means of a segregation distribution which is well adapted to the available
data and which satisfies the theoretical requirements. With such a
distribution at the basis we want to obtain a fairly complete insight
into the theoretical side of the problem, including recurrence relations,
stability, limit behaviour, and rate of approach to equilibrium. The
conclusions reached so far for m = 1 leave no doubt that we have to
expect an entirely new type of results.

7. SOME OPEN PROBLEMS 

The mathematical description of polyploid linkage under chromatid
segregation constitutes an example of an open problem. There are
many open problems within the limits set by the title of this paper. Let
us mention a few examples.

The genetics of autopolyploids should be investigated under the assumption
of different segregation distributions for the two sexes. In
the vast domain of selection problems the results are still incomplete
in many instances, even from a purely mathematical point of view. As
a very simple example consider the first of the selection problems described
in section 2. Here, a general integration has not been given in
case of arbitrary selection coefficients; the limit behaviour of the distributions,
however, can be completely described (cf. in particular Haldane’s
[9d] basic work on selection). If we assume different selection coefficients
for the two sexes, then, to the author’s knowledge, not even
the limit behaviour has been investigated for the general case. The generalization
from two to r alleles, almost trivial in case of random mating,
introduces considerable difficulties in selection problems. Linkage
under differential-viability-selection as well as polysomic inheritance
are not easy problems. There are similar open problems for the various
“systems of mating” which involve some choice of the mate.
These examples have been chosen such as not to require the explanation
of new biological situations. Other mathematical problems arise in
connection with so-called “sex-linked” inheritance in contrast to “autosomal”
inheritance which has been considered throughout this paper.
If the assumption of “non-overlapping generations” (see section 1) is
dropped we are faced with completely new and quite difficult problems.

The results reported here and the problems explained may appear,
even to the mathematically-minded biologist, as rather formal and
somehow remote from his field of interest. Modern genetics is an extremely
complex science, with relations to various fields of knowledge,
such as general biology, biophysics and biochemistry, general anthropology,
physiology, and psychology. “Formal genetics” as considered
in this paper represents only one side of a many-sided problem. We
need not however lose sight of the totality of a problem if we follow up
thoroughly only one aspect. Modern physics began when Galileo performed
in a scientific way some very simple experiments and described
them in mathematical terms; modern genetics started with Mendel’s
exact observations of simple phenomena and their mathematical description.
Today, in biology as well as in other branches of knowledge,
the mathematical approach hardly needs justification. Nevertheless the
biologist may sometimes wonder whether the generalizations which the
mathematician uses are of any value to him. Each kind of approach has
its inherent logic and must abide by its own inner laws. The mathematical
approach to biological problems, by way of abstraction and
generalization, will prove rewarding at present and in the future.

SELECTED REFERENCES 

[1] F. Bernstein

(a) “Variations- und Erblichkeitsstatistik,” Handbuch der Vererbungswissenschaft,
Bd. 1, pp. 1-96.

(b) “Zusammenfassende Betrachtungen über die erblichen Blutstrukturen
des Menschen,” Zs. f. Induktive Abstammungs- u. Vererbungslehre, Vol. 37
(1925).

[2] D. De Winton and J. B. S. Haldane, “Linkage in the tetraploid Primula
Sinensis,” Journal of Genetics, Vol. 24 (1931), pp. 1-44.

[3] Th. Dobzhansky, Genetics and the Origin of Species, Columbia Univ. Press, 

(1938). 

[4] Th. Dobzhansky and L. C. Dunn, Heredity, Race, and Society, Penguin Books,
New York, 1946.

[5] R. A. Fisher 

(a) The Genetical Theory of Natural Selection, Oxford Press, 1930. 

(b) “The theoretical consequences of polyploid inheritance for the third
style form of Lythrum salicaria,” Annals of Eugenics, Vol. 11 (1941),
pp. 31-39.

(c) “Allowance for double reduction in the calculation of genotype frequencies
with polysomic inheritance,” Annals of Eugenics, Vol. 12 (1944),
pp. 169-171.

(d) “The theory of linkage in polysomic inheritance,” Phil. Trans. Roy.
Soc. London, B, 233 (1947), pp. 65-87.

(e) “A quantitative theory of genetic recombination and chiasma formation,”
Biometrics, Vol. 4 (1948), pp. 1-13.

[6] R. A. Fisher and K. Mather, “The inheritance of style length in Lythrum
Salicaria,” Annals of Eugenics, Vol. 12 (1943), pp. 1-23.

[7] H. Geiringer

(a) “On the probability theory of linkage in Mendelian heredity,” Annals
of Mathematical Statistics, Vol. 15 (1944), pp. 25-57.

(b) “Further remarks on linkage in Mendelian heredity,” Ibidem, Vol. 16
(1945), pp. 390-398.

(c) “Contribution to the heredity theory of multivalents,” Journal of
Mathematics and Physics, Vol. 26 (1948), pp. 246-278.

(d) “On the mathematics of random mating in case of different recombination
values for males and females,” Genetics, 33 (1948), pp. 548-564.

(e) “Contribution to the linkage theory of autopolyploids,” I and II,
Bulletin Mathem. Biophysics, Vol. 11 (1949), I, pp. 59-82, and II,
pp. 197-219.

(f) “Chromatid segregation of tetraploids and hexaploids,” forthcoming,
Genetics, 34 (1949).

[8] H. Geppert and S. Koller, Erbmathematik, Leipzig, 1938.

[9] J. B. S. Haldane 

(a) The Causes of Evolution, London, New York, 1932. 

(b) New Paths in Genetics, London, 1941. 

(c) “The combination of linkage values and the calculation of distances be¬ 
tween the loci of linked factors,” Journal of Genetics, Vol. 8 (1919), pp. 
299-308. 

(d) “A mathematical theory of natural and artificial selection,” Transactions
of the Cambridge Philosophical Society, Vol. 23 (1924), pp. 19-41. Successive
parts in later volumes: Proceedings, Cambridge Philosophical
Society, 23 (1927), pp. 363-372, 607-615, 838-848, ibidem, 26 (1930),
pp. 220-230, 27 (1931), pp. 131-142, 28 (1932), pp. 244-248.

(e) “Theoretical genetics of autopolyploids,” Journal of Genetics, Vol. 22 
(1930), pp. 359-372. 

(f) “The rate of spontaneous mutation of a human gene,” Journal of 
Genetics, Vol. 31 (1935). 

(g) “Some problems of mathematical biology,” Jour, of Mathematics and 
Physics, Vol. 14 (1935), pp. 125-136. 

[10] G. H. Hardy, “Mendelian proportions in a mixed population,” Science,

Vol. 28 (1908), pp. 49-50. 

[11] A. Heilbronn and K. Kosswig, “Principia Genetica,” The Journal of Unified 

Science, Vol. 8 (1939), pp. 229-255. 

[12] L. Hogben, An Introduction to Mathematical Genetics, W. W. Norton, New 

York, 1946. 

[13] H. S. Jennings 

(a) “The numerical results of diverse systems of breeding,” Genetics, Vol. 1 
(1916), pp. 63-89. 

(b) “The numerical results of diverse systems of breeding with respect to
two pairs of characters etc. etc.,” Genetics, Vol. 2 (1917), pp. 97-154.

(c) “The numerical relations in the crossing over of the genes etc. etc.,”
Genetics, Vol. 8 (1923), pp. 393-457.

[14] W. Johannsen, Elemente der exakten Erblichkeitslehre, 2. Auflage, Jena, 1913.

[15] V. A. Kostitzin, Biologie Mathématique (Préface de Vito Volterra), Paris,
1937.

[16] K. Mather 

(a) The Measurement of Linkage in Heredity, London, 1938. 

(b) “Reductional and equational separation of the chromosomes in bivalents
and multivalents,” Journal of Genetics, Vol. 30 (1935), pp. 53-77.

(c) “Segregation and linkage in autotetraploids,” Ibidem, Vol. 32 (1936), 
pp. 287-314. 

[17] G. Mendel, “Versuche über Pflanzenhybriden,” Verhandlg. d. Naturforschenden
Ver. in Brünn, Abhandlungen, IV. Bd., Brünn, 1866, pp. 3-47.

[18] T. H. Morgan, The Theory of the Gene, New Haven, 1928. 

[19] K. Pearson, “On a generalized theory of alternative inheritance with special
reference to Mendel’s law,” Phil. Trans. Roy. Soc. (A), Vol. 203 (1904), pp.
53-86.

[20] R. B. Robbins, “Applications of mathematics to breeding problems II, and 
III,” Genetics, Vol. 3 (1918), pp. 73-92 and pp. 375-389. 

[21] F. W. Sansome, “Chromatid segregation in Solanum Lycopersicum,” Journal
of Genetics, Vol. 27 (1933), pp. 105-132.

[22] F. W. Sansome and J. Philp, Recent Advances in Plant Genetics, Sec. Ed., 
London, 1939. 

[23] L. H. Snyder and C. W. Cotterman, “Studies on Human Inheritance XVII, 
Gene frequency analysis etc.,” Genetica, Vol. 19 (1937), pp.%37-552 

[24] H. Tietze, “Über das Schicksal gemischter Populationen nach den Mendelschen
Vererbungsgesetzen,” Zs. Angew. Math. u. Mech., Bd. 3 (1923), pp.
362-393.

[25] W. Weinberg, “Über Vererbungsgesetze beim Menschen,” Zs. f. Indukt.
Abstammungs- und Vererbungslehre, Vol. 1 (1909), pp. 277-330.

[26] S. Wright 

(a) “Systems of mating. I-V,” Genetics, Vol. 6 (1921), pp. 111-178. 

(b) “Evolution in Mendelian populations,” Genetics, Vol. 16 (1931) pp. 97- 
159. 

(c) “The distribution of gene frequencies in populations of autopolyploids,” 
Proc. Nat. Acad. Sciences, Vol. 24 (1938), pp. 372-377. 

(d) “Statistical genetics and evolution,” Bulletin of the American Math.
Soc., Vol. 48 (1942), pp. 223-246.

(e) “Isolation by distance under diverse systems of mating,” Genetics, Vol. 
31 (1946), pp. 39-59. 

(f) “On the roles of directed and random changes in gene frequency in the 
genetics of populations.” Evolution, Vol. II, (1948), pp. 279-294. 



THE FITTING OF LOGISTIC CURVES BY MEANS 
OF A NOMOGRAPH 


Eugene A. Rasor

Federal Security Agency, Social Security Administration 

“Growth” curves, such as logistic curves, are widely used. 
This article discusses a previous paper on this subject and 
shows how simpler nomographs for fitting the logistic curve 
may be developed. First, a simpler nomograph of the same type 
as in the previous article is presented, and then a different 
type is developed which has the advantage of practically no 
computations, simplicity of appearance, and a wider range of 
values. 

Spurr and Arnold have shown how to fit a logistic curve quite
simply by means of a nomograph.* This paper will first present a
nomograph for finding the upper asymptote determined in the same
fashion as did Spurr and Arnold, but somewhat simpler in having two
of the scales parallel, and in addition taking in more values. Next, this
paper will develop a nomograph of a different type, by means of which
the logistic curve may be more readily and more accurately fitted to
three equidistant values.

Chart 1 shows the nomograph for determining the upper limit of the
logistic curve,

$y = k/(1 + e^{a+bx})$,

with the three scales being $y_1/y_0$, $y_2/y_1$, and $k/y_0$. It will be noted that
the latter two scales are parallel and cover roughly the same values as
did the Spurr-Arnold nomograph. However, the other scale shows a
range of from 1.4 to 30.0, as contrasted with only 1.5 to 4.2 in the
Spurr-Arnold nomograph. Thus, not only is the nomograph in Chart 1
more symmetrical, but also by utilizing the entire sheet it can cover
more possible cases.


* William A. Spurr and David R. Arnold, “A Short-Cut Method of Fitting a Logistic Curve,”
Journal of the American Statistical Association, March 1948.



This nomograph has been determined from the following determi¬ 
nant: 


0 


396 ^ 1 $ 

287^1* - 178A + 287 


4A(e - 4) 

41(0 - 1) ^ 

(7 - 2B) 

ZB " 

44(7A« - 20 A - 1 - 28) 
287A* - 178il + 287' 


' ' = 0 


1 


where $A = y_1/y_0$, $B = y_2/y_1$ and $e = k/y_0$ as in the article by Spurr and
Arnold; $\delta$ and $\mu$ are the width and height respectively of the chart.

Next, we turn to the development of a simpler nomograph which gives
accurate values of k, a, and b directly from three equidistant
values of y, without any preliminary calculation of the ratios of the
y’s and subsequent multiplication, and which also gives intermediate (or
extended) values of $y_x$ directly from the graph.

For example, given the three equidistant values, $y_0$, $y_1$, and $y_2$, of

$y = \dfrac{k}{1 + e^{a+bx}}$,

let the problem be to find k, a, and b. First, plot the two points $(y_0, y_1)$
and $(y_1, y_2)$ on the nomograph, Chart 2. Then, draw a straight line
connecting these two points and extend it to intersect the 45° line, y = x.
The x coordinate of this point of intersection is k. Connecting the point
k on the $y_x$ scale with the point $(1+y_0, 1+y_0)$ and extending to the scale marked
“a+bx” yields the value of a at the intersection. Similarly, connecting
point k successively to points $(1+y_1, 1+y_1)$ and $(1+y_2, 1+y_2)$ yields
a+b and a+2b, respectively. The value of b is then determined by subtracting
a from a+b; subtracting a+b from a+2b also gives a value of
b and affords a check on the computations. Having determined a and b,
we can now list the values of a+bx for which $y_x$ values are desired. The
intersection with the 45° line of a line drawn between the point k on
the $y_x$ scale and a+bx on the top scale yields $1+y_x$ on either the $y_x$
scale or the $y_{x+1}$ scale, from which $y_x$ is readily determined by subtracting 1.

The mathematical solution of this problem in terms of $y_0$, $y_1$, and $y_2$ is

$k = \dfrac{2(1/y_1) - (1/y_0) - (1/y_2)}{(1/y_1)^2 - (1/y_0)(1/y_2)}$

$e^{a+bx} = (k/y_x) - 1$, where x = 0, 1, 2, and
where a and b are determinable from the last equation.
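The algebraic solution can be evaluated directly and compared with readings taken from the chart. The following is a minimal sketch; the function name is hypothetical, and the three values 2, 4, 5 are those of the worked problem solved graphically in this article.

```python
from math import log


def fit_logistic_three_points(y0, y1, y2):
    """Return (k, a, b) of y = k / (1 + exp(a + b*x)) through x = 0, 1, 2."""
    r0, r1, r2 = 1.0 / y0, 1.0 / y1, 1.0 / y2
    k = (2.0 * r1 - r0 - r2) / (r1 * r1 - r0 * r2)
    a = log(k / y0 - 1.0)           # from exp(a) = k/y0 - 1
    a_plus_b = log(k / y1 - 1.0)    # from exp(a + b) = k/y1 - 1
    b = a_plus_b - a
    return k, a, b


k, a, b = fit_logistic_three_points(2.0, 4.0, 5.0)
# Second determination of b, from a + 2b, as a check on the first.
b_check = log(k / 5.0 - 1.0) - (a + b)
print(round(k, 3), round(a, 3), round(b, 3), round(b_check, 3))
```

The logarithms require k to exceed each of the three given values, which holds whenever the three points actually lie on an ascending logistic curve.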

It can be shown* that if an equation f(F, G, H) = 0 can be written as a
determinant in the following form:

$\begin{vmatrix} F_1 & F_2 & 1 \\ G_1 & G_2 & 1 \\ H_1 & H_2 & 1 \end{vmatrix} = 0$

and three function scales are drawn with coordinates $X_F = F_1$, $Y_F = F_2$,
$X_G = G_1$, $Y_G = G_2$, $X_H = H_1$, $Y_H = H_2$, then the intersection of a straight
line with these three scales yields a solution to the equation f(F, G, H) = 0.

The above equation for k may be written in the following third order
determinant:

$\begin{vmatrix} 1/k & 1/k & 1 \\ 1/y_0 & 1/y_1 & 1 \\ 1/y_1 & 1/y_2 & 1 \end{vmatrix} = 0$

which is the necessary and sufficient condition that the three points
$(1/k, 1/k)$, $(1/y_0, 1/y_1)$ and $(1/y_1, 1/y_2)$ are collinear, when the function
scales are determined as above. The above equation for $e^{a+bx}$ may be
written:


$\begin{vmatrix} \delta/k & 0 & 1 \\ \delta/(1+y_x) & \mu/(1+y_x) & 1 \\ \delta e^{a+bx}/(1+e^{a+bx}) & \mu & 1 \end{vmatrix} = 0$

which is the necessary and sufficient condition that the three points
$(1/k, 0)$, $(1/(1+y_x), 1/(1+y_x))$ and $(e^{a+bx}/[1+e^{a+bx}], 1)$ are collinear, when
the function scales are determined as before. Thus Chart 2 is basically
double reciprocal paper and the scale at the top (from the above determinant)
is $y = \mu$ and $x = \delta e^{a+bx}/(1+e^{a+bx})$, where $\delta$ and $\mu$ are the total
width and length of the chart grid, respectively.
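Both collinearity conditions can be verified numerically by evaluating the third order determinants. This is a minimal sketch, assuming a chart of unit width and height (so that the two chart dimensions drop out) and using the values 2, 4, 5 of the worked problem; the helper names are hypothetical.

```python
from math import exp, log


def det3(m):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


y0, y1, y2 = 2.0, 4.0, 5.0
r0, r1, r2 = 1.0 / y0, 1.0 / y1, 1.0 / y2
k = (2.0 * r1 - r0 - r2) / (r1 * r1 - r0 * r2)

# First condition: (1/k, 1/k), (1/y0, 1/y1), (1/y1, 1/y2) lie on one line.
det_k = det3([[1 / k, 1 / k, 1], [r0, r1, 1], [r1, r2, 1]])

# Second condition, checked at x = 0: (1/k, 0), (1/(1+y0), 1/(1+y0)) and the
# point on the top scale belonging to a = log(k/y0 - 1) lie on one line.
a = log(k / y0 - 1.0)
top = exp(a) / (1.0 + exp(a))
det_a = det3([[1 / k, 0, 1], [1 / (1 + y0), 1 / (1 + y0), 1], [top, 1, 1]])

print(det_k, det_a)
```

Both determinants come out as zero up to rounding error, which is the numerical counterpart of the straight dotted lines on the chart.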

The nomograph has five dotted lines showing the solution of the
following specific problem: Given $y_0 = 2$, $y_1 = 4$, and $y_2 = 5$, to find k, a, b
and some intermediate value of $y_x$, say, $y_{.5}$. First, the points (2, 4) and
(4, 5) are plotted. The dotted line through these two points is extended
to the line y = x. Then, reading the x coordinate of intersection on the
$y_x$ scale yields the value of k, namely, 5.3. The dotted line through the
points (5.3, 0) and (3, 3) intersects the a+bx scale at .51, which is the

* See Williamson, W. R. and Rasor, E. A., “Some Nomographic Theory and Applications to
Benefits Under Retirement Plans,” Record of the American Institute of Actuaries, XXX, 1941.





value of a. Likewise, the dotted line through the points (5.3, 0) and
(5, 5) yields -1.10 for the value of a+b, and the dotted line through
the points (5.3, 0) and (6, 6) yields -2.7 for the value of a+2b. By
subtraction (-1.10 - .51), the value of b is -1.6, which is the same as
obtained from the other subtraction, -2.7 - (-1.1) = -1.6, so the
computation is checked.

CHART 2. NOMOGRAPH FOR SOLUTION OF $y = k/(1 + e^{a+bx})$

To find $y_{.5}$ we first determine a+.5b, which in this case is
.51 + .5(-1.6) = -.29, and the x or y coordinate of the intersection of
the 45° line with the line drawn through 5.3 (the value of k) on the
$y_x$ scale and -.29 on the a+bx scale yields $1 + y_{.5} = 4.1$ or $y_{.5} = 3.1$, the
desired result.
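The graphical reading of an intermediate value can be imitated by computing the intersection of the connecting line with the 45° line. This is a minimal sketch, assuming a chart of unit width and height with the bottom scale carrying 1/k at height 0 and the top scale carrying $e^{a+bx}/(1+e^{a+bx})$ at height 1; the function name is hypothetical.

```python
from math import exp, log


def read_off_y(k, a, b, x):
    """Intermediate value y_x obtained as on the nomograph of Chart 2."""
    e_term = exp(a + b * x)
    top = e_term / (1.0 + e_term)   # abscissa on the top scale (height 1)
    base = 1.0 / k                  # abscissa of the point k (height 0)
    # Points of the connecting line: (base + t * (top - base), t).
    # On the 45-degree line abscissa and ordinate are equal, which gives t.
    t = base / (1.0 - (top - base))
    return 1.0 / t - 1.0            # the reading t corresponds to 1/(1 + y_x)


k = 16.0 / 3.0                      # exact value behind the chart reading 5.3
a = log(k / 2.0 - 1.0)
b = log(k / 4.0 - 1.0) - a
y_half = read_off_y(k, a, b, 0.5)
print(round(y_half, 3))
```

The computed value agrees with the direct evaluation of $k/(1+e^{a+.5b})$ and, after rounding to the accuracy of the chart, with the reading of 3.1 obtained graphically.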

In summary, it may be seen that the nomograph of Chart 2 has
scales in simple linear form and may be entered directly without any
preliminary calculations, while the result likewise is obtainable directly
without any supplementary computations. A wide range is present
for the various values of y, and if necessary this may be further
broadened in three ways: by enlargement of the chart, by a change of
decimal place in all three given values of y, and by dividing the original
data by $y_0$. These advantages indicate a much easier fitting of the
logistic curve by a nomograph of the type of Chart 2 than by one
such as that developed by Spurr and Arnold.

An even simpler nomograph for determining 1/k only would be to
use a sheet of ordinary linear coordinate paper with a line drawn at 45°
and the scales marked $1/y_x$ and $1/y_{x+1}$ in lieu of the $y_x$ and $y_{x+1}$ shown in
Chart 2. However, in using such a chart it would be necessary to predetermine
the reciprocals of y, and the result would yield the reciprocal
of k. The scale for a+bx could also be placed at the top as before, and
resulting values for intermediate points would be $1/(y_x+1)$.

If more than three points are available, plotting all the points as
above instead of a select three will show that the curve is of the
logistic type if they lie on a straight line. On the other hand, if the
points are not collinear, an average value of k can be obtained from
the chart.
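The straight-line test for more than three points can also be carried out numerically. The sketch below is one possible way of doing so and not the graphical averaging described above: it fits a least-squares line to the points $(1/y_x, 1/y_{x+1})$, reports the largest departure from that line, and takes the intersection of the line with the 45° line as 1/k. The function name and the sample series are hypothetical.

```python
from math import exp


def estimate_k(ys):
    """Least-squares line through (1/y_x, 1/y_{x+1}); returns (k, worst residual)."""
    u = [1.0 / v for v in ys[:-1]]
    w = [1.0 / v for v in ys[1:]]
    n = len(u)
    u_bar = sum(u) / n
    w_bar = sum(w) / n
    sxx = sum((ui - u_bar) ** 2 for ui in u)
    sxy = sum((ui - u_bar) * (wi - w_bar) for ui, wi in zip(u, w))
    slope = sxy / sxx
    intercept = w_bar - slope * u_bar
    worst = max(abs(wi - (intercept + slope * ui)) for ui, wi in zip(u, w))
    # The line meets the 45-degree line where u = intercept / (1 - slope) = 1/k.
    return (1.0 - slope) / intercept, worst


# Six equidistant values taken from an exact logistic with k = 10, a = 1, b = -0.5.
series = [10.0 / (1.0 + exp(1.0 - 0.5 * x)) for x in range(6)]
k_hat, worst_residual = estimate_k(series)
print(round(k_hat, 6), worst_residual)
```

For an exact logistic series the residual is zero up to rounding and the upper limit is recovered; a clearly non-zero residual indicates that the points are not collinear and that the logistic form fits only approximately.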



ON THE BEST CHOICE OF SAMPLE SIZES FOR A 
t-TEST WHEN THE RATIO OF VARIANCES 
IS KNOWN 


John E. Walsh 
The Rand Corporation 

The situation considered is that of testing the difference of 
the means of two normal populations on the basis of a sample 
from each population, where the ratio of the population vari¬ 
ances is known. The choice of sample sizes has been restricted 
to certain pairs which are equally preferable from the view¬ 
point of practical considerations (cost, difficulty of obtaining 
sample values, etc.). This note presents an easily applied 
method of determining which of these pairs of sample sizes 
yield the most powerful one-sided and symmetrical tests.


I. INTRODUCTION AND STATEMENT OF RESULTS

The most powerful one-sided and symmetrical tests for comparing
the difference of the means of two normal populations (variances
unknown, ratio of variances known) with a given hypothetical value
$D_0$ on the basis of sample values $x_1, \ldots, x_n$ from the first population
and $y_1, \ldots, y_m$ from the second population are based on the t-statistic

(1)   $t = \dfrac{(\bar{x} - \bar{y} - D_0)\sqrt{(n+m-2)/(\theta/n + 1/m)}}{\sqrt{\frac{1}{\theta}\sum_{i}(x_i - \bar{x})^2 + \sum_{j}(y_j - \bar{y})^2}}$,

where $\theta$ equals the ratio of the variance of the first population to the
variance of the second population (see [1]). This statistic has a t-distribution
with n + m - 2 degrees of freedom when the null hypothesis that
the difference of means (mean of first population minus mean of
second population) equals $D_0$ is true.
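The statistic (1) is easily computed from the two samples. The following is a minimal sketch, assuming that the sum of squares of the first sample is divided by the known variance ratio before pooling, which is what gives the statistic n + m - 2 degrees of freedom; the function name, argument names and sample values are hypothetical.

```python
from math import sqrt


def t_statistic(x, y, theta, d0=0.0):
    """t-statistic for the difference of means when Var(x)/Var(y) = theta is known."""
    n, m = len(x), len(y)
    x_bar = sum(x) / n
    y_bar = sum(y) / m
    ss_x = sum((v - x_bar) ** 2 for v in x)
    ss_y = sum((v - y_bar) ** 2 for v in y)
    pooled = ss_x / theta + ss_y    # estimates (n + m - 2) times the second variance
    scale = sqrt((n + m - 2) / (theta / n + 1.0 / m))
    return (x_bar - y_bar - d0) * scale / sqrt(pooled)


# With theta = 1 the statistic reduces to the ordinary pooled two-sample t.
t_equal = t_statistic([1.0, 2.0, 3.0], [2.0, 4.0, 6.0], theta=1.0)
print(round(t_equal, 4))
```

The value is referred to a t-table with n + m - 2 degrees of freedom, one-sided or symmetrically according to the test desired.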

For the situation considered in this note, it is assumed that a num¬ 
ber of pairs of sample sizes (n, m) have already been determined such 
that from the viewpoint of cost, inconvenience, etc., these pairs are 
equally preferable to the person applying the test; it is also assumed 
that the value of B is known, either from past experience or by other 
means. The problem is to determine which of these given pairs of 
sample sizes yields the most powerful test (one-sided or symmetrical 
at the specified significance level) of the hypothesis considered. From 
above, the most powerful test will be based on the t-statistic (1) so 


SAMPLE SIZES FOR A t-TEST 


that this problem reduces to that of determining the pair of sample 
sizes which yield the most powerful test based on (1). 

It is found (approximately) that the pair of sample sizes (n, m) 
which yields the most powerful one-sided test at significance level $\alpha$ 
and the most powerful symmetrical test at significance level $2\alpha$ is that 
which furnishes the smallest value for the quantity 

(2) $$ (\theta/n + 1/m)\left[1 - \frac{K_\alpha^2}{2(n + m - 2)}\right]^{-1}, $$

where $K_\alpha$ is the standardized normal deviate exceeded with probability 
$\alpha$; i.e., $K_\alpha$ is defined by 

$$ \frac{1}{\sqrt{2\pi}} \int_{K_\alpha}^{\infty} e^{-x^2/2}\,dx = \alpha. $$

This criterion for choosing the pair of sample sizes which yields the 
most powerful test is reasonably accurate for $n+m \ge 6$ if $\alpha = 5\%$, 
$n+m \ge 7$ if $\alpha = 2.5\%$, $n+m \ge 8$ if $\alpha = 1\%$, $n+m \ge 9$ if $\alpha = 0.5\%$. 

II. A FIELD OF APPLICATION 

The results of this note are of most value when the ratio of the popu¬ 
lation variances can be considered known but the values of the vari¬ 
ances are unknown. Some situations of this type are outlined below: 

Let us consider a situation where the same treatment is applied to 
objects representative of populations of two different types. Then 
some common characteristic of the two types of objects is measured. 
The values thus obtained represent samples from each of the two types 
of populations, and these samples can be used to compare the relative 
effect of the treatment on the two types of objects (with respect to the 
specified characteristic). For example, the treatment might be a certain 
type of feed, the populations two different breeds of hogs, and the 
characteristic the weight at some specified future date. 

Next let us consider the effect of different treatments on a fixed 
population. The mean and variance of the theoretical population of 
the values of the specified characteristic will vary with the treatment 
used. A not too uncommon situation is that where the variance of the 
theoretical population is a slowly increasing function of the mean of 
that population; i.e., if the magnitude of the observations increases, 
the variance of the observations also increases. Then a treatment which 
yields a theoretical population with a large mean value will yield a 
larger variance for this theoretical population than a treatment which 
results in a theoretical distribution with a smaller mean value. In the 
remainder of this section it will be assumed that the theoretical populations 
obtained for the specified characteristic have this property. 

AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1949 

Finally let us consider applying a new treatment to two different 
populations of the type described above when data concerning the 
effect of some other treatment on these same populations are available. 
In many cases it can be assumed that the relative effect of the new 
treatment (with respect to the old one for which data are available) 
will be either to increase both of the theoretical population means or to 
decrease both of them. For example, an improved feed would be ex¬ 
pected to increase the average weight of both breeds of hogs (though 
not necessarily the same amount) while a less satisfactory feed would 
be expected to decrease both average weights. If this is the case, the 
ratio of the variances of the theoretical distributions for the new treat¬ 
ment will tend to be about the same as the ratio of the corresponding 
variances for the old treatment; this ratio is therefore approximately 
known. Since the theoretical population means for the new treatment 
could be either larger or smaller than the corresponding means for the 
old treatment, however, estimation of the variances of the theoretical 
distribution for the new treatment from the corresponding values for 
the old treatment usually can not be done with any reasonable degree 
of accuracy. Thus the ratio of variances is available but the values of 
the variances are not. 

III. EXAMPLE OF APPLICATION 

Let us consider a case in which the only practical consideration is 
cost. Here the cost of an observation from the first population is always 
$10, the cost of an observation from the second population is always 
$100, while the total cost of the experiment is limited to $400. Then 
the three pairs of sample sizes 

(10, 3), (20, 2), (30, 1) 

are equally preferable. Also for this particular situation it is known 
from past experience that $\theta = 2$. 

Let us determine which pair of sample sizes yields the most powerful 
symmetrical test at the 1% significance level. Then $\alpha = 0.5\%$, 
$K_\alpha^2 = 6.62$, and the following table is obtained: 

    (n, m)      (2) 
    (10, 3)     0.76 
    (20, 2)     0.72 
    (30, 1)     1.20 


Thus (20, 2) yields the most powerful symmetrical test at the 1% 
significance level. 





A rough measure of the gain from using (20, 2) rather than (10, 3) 
is obtained by observing that (16,2) furnishes approximately the same 
power as (10, 3). Thus about $40 is lost by using (10, 3). 

If the significance level of the symmetrical test had been 10% rather 
than 1%, the following table would have been obtained: 

    (n, m)      (2) 
    (10, 3)     0.61 
    (20, 2)     0.64 
    (30, 1)     1.12 


Then (10, 3) would yield the most powerful test rather than (20, 2). 

Since for fixed n and m the power of a one-sided or symmetrical 
t-test based on (1) increases monotonely as the value of 

$$ \sigma_1^2/n + \sigma_2^2/m = \sigma_2^2(\theta/n + 1/m) $$

decreases ($\sigma_1^2$ = variance of first population, $\sigma_2^2$ = variance of second 
population), it might be thought that the pair of sample sizes which 
furnishes the smallest value of $\theta/n + 1/m$ would yield the most powerful 
test. That this is not necessarily so is seen by considering the above 
examples. The criterion based on $\theta/n + 1/m$ is independent of the 
significance level while the two examples show that the choice of the 
pair of sample sizes which yields the most powerful test varies with 
the significance level. 
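
The two tables of this section can be checked with a short Python sketch. It evaluates criterion (2) with the bracketed factor taken in the denominator, which is the reading that reproduces the tabulated values; the function name `criterion` and the use of the standard library's `NormalDist` to obtain the normal deviate are choices made here for illustration, not part of the note.

```python
from statistics import NormalDist


def criterion(n, m, theta, alpha):
    """Quantity (2): (theta/n + 1/m) / [1 - K^2 / (2(n + m - 2))]."""
    k = NormalDist().inv_cdf(1 - alpha)  # deviate exceeded with probability alpha
    return (theta / n + 1 / m) / (1 - k * k / (2 * (n + m - 2)))


pairs = [(10, 3), (20, 2), (30, 1)]  # equal-cost pairs from the example
theta = 2                            # known variance ratio from the example

for level in (0.01, 0.10):           # significance level of the symmetrical test
    alpha = level / 2                # corresponding one-sided level
    values = {pair: criterion(*pair, theta=theta, alpha=alpha) for pair in pairs}
    best = min(values, key=values.get)
    for pair, value in values.items():
        print(f"level {level:.2f}  {pair}: {value:.2f}")
    print(f"level {level:.2f}  smallest value at {best}")
```

Under these assumptions the smallest value falls at (20, 2) for the 1% level and at (10, 3) for the 10% level, in agreement with the tables.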

IV. DERIVATIONS 

This section presents proof of the statement that to a reasonable 
approximation the pair of sample sizes which yields the most powerful 
one-sided test at significance level $\alpha$ and the most powerful symmetrical 
test at significance level $2\alpha$ is that which furnishes the smallest 
value for (2). 

Let $\mu$ be the mean of the first population while $\nu$ is the mean of the 
second population. First consider the one-sided test of $\mu - \nu < D_0$ 
at significance level $\alpha$ and based on samples of sizes n and m respectively. 
Using a modification of the normal approximation given in 
[2], it is found that the power function of this type (1) t-test is approximately 
equal to 

$$ N\left\{ -K_\alpha + \frac{D_0 - \mu + \nu}{\sqrt{\sigma_1^2/n + \sigma_2^2/m}} \left[1 - \frac{K_\alpha^2}{2(n + m - 2)}\right]^{1/2} \right\}, $$

where by definition 

$$ N(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-x^2/2}\,dx. $$



This approximation is reasonably accurate for $n+m \ge 6$ if $\alpha = 5\%$, 
$n+m \ge 7$ if $\alpha = 2.5\%$, $n+m \ge 8$ if $\alpha = 1\%$, $n+m \ge 9$ if $\alpha = 0.5\%$. 

Thus for $\mu - \nu < D_0$ and any permissible fixed values of $\alpha$, $\sigma_1^2$, $\sigma_2^2$, 
$D_0$, $\mu$, $\nu$, the value of the power function is approximately largest when 
n and m are chosen so that (2) is as small as possible. This verifies the 
statement for the one-sided test of $\mu - \nu < D_0$. By symmetry the same 
result holds for the one-sided test of $\mu - \nu > D_0$. 

Again using the modified normal approximation, the power function 
of the symmetrical type (1) t-test of $\mu - \nu \ne D_0$ is approximately equal to 

(3) $$ 2\alpha + \frac{1}{\sqrt{2\pi}} \int_{-K_\alpha}^{-K_\alpha + \delta} e^{-x^2/2}\,dx - \frac{1}{\sqrt{2\pi}} \int_{-K_\alpha - \delta}^{-K_\alpha} e^{-x^2/2}\,dx, $$

where 

$$ \delta = \frac{|D_0 - \mu + \nu|}{\sqrt{\sigma_1^2/n + \sigma_2^2/m}} \left[1 - \frac{K_\alpha^2}{2(n + m - 2)}\right]^{1/2}. $$

Since $2\alpha < 1$, $\alpha < 0.5$ and (3) is a monotonely increasing function of 
$\delta$. Thus for any fixed values of $\alpha$, $\sigma_1^2$, $\sigma_2^2$, $D_0$, $\mu$, $\nu$ the value of the power 
function is approximately greatest when n and m are chosen so that 
(2) is minimum. This verifies the statement for symmetrical tests. 
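
The approximate power functions of this section can be sketched numerically as follows. The code assumes the normal approximation takes the form N(-K + d) for the one-sided test and N(-K + |d|) + N(-K - |d|) for the symmetrical test, where d is the difference D0 - mu + nu divided by the standard error of the difference of sample means and multiplied by the square root of 1 - K^2/2(n + m - 2). The function names and the argument `shift` (standing for D0 - mu + nu) are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # plays the role of N(z) in the text


def scaled_shift(shift, n, m, var1, var2, alpha):
    """shift * [1 - K^2 / 2(n+m-2)]^(1/2) / sqrt(var1/n + var2/m)."""
    if n + m <= 2:
        raise ValueError("need n + m > 2")
    k = STD_NORMAL.inv_cdf(1 - alpha)
    factor = 1 - k * k / (2 * (n + m - 2))
    if factor <= 0:
        raise ValueError("n + m is too small for this approximation")
    return shift * sqrt(factor) / sqrt(var1 / n + var2 / m)


def power_one_sided(shift, n, m, var1, var2, alpha):
    """Approximate power of the one-sided test at level alpha."""
    k = STD_NORMAL.inv_cdf(1 - alpha)
    return STD_NORMAL.cdf(-k + scaled_shift(shift, n, m, var1, var2, alpha))


def power_symmetrical(shift, n, m, var1, var2, alpha):
    """Approximate power of the symmetrical test at level 2 * alpha."""
    k = STD_NORMAL.inv_cdf(1 - alpha)
    d = abs(scaled_shift(shift, n, m, var1, var2, alpha))
    return STD_NORMAL.cdf(-k + d) + STD_NORMAL.cdf(-k - d)
```

When `shift` is zero the two functions return alpha and 2 * alpha respectively, as a power function should at the null hypothesis. For a fixed nonzero shift, the pair (n, m) with the smaller value of (2) gives the larger approximate power.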


REFERENCES 

[1] Henry Scheffé, "On solutions of the Behrens-Fisher problem based on the 
t-distribution," Annals of Math. Stat., Vol. 14 (1943), p. 43. 

[2] N. L. Johnson and B. L. Welch, "Applications of the non-central t-distribution," 
Biometrika, Vol. 31 (1940), p. 376. 



NOTE ON SOME ERRORS IN “THE EVIDENCE OF 
PERIODICITY IN SHORT TIME SERIES” 


Several fundamental errors appear in Truman Kelley’s “The 
Evidence of Periodicity in Short Time Series.” These involve a confusion 
of numbers of observations and degrees of freedom, the 
interpretation of probabilities obtained in tests of hypotheses 
and the appropriateness of periodigram analysis for detecting 
the existence of periodicity in time series. 

I 

The article by Truman L. Kelley, “The Evidence of Periodicity in 
Short Time Series,”¹ contains several fundamental errors which 
should be noted. These particular errors are fairly widespread in popular 
literature and in practical applications. If these errors occur in this 
Journal and are not corrected their propagation will be insured since 
Kelley is deservedly a leading authority in Statistics. The younger 
generation, as well as the older, frequently rely on authority. 

Originally, in 1943, the writer had considered it not worthwhile to 
write this corrective note. But recently on two occasions he has seen 
others using the technique presented by Kelley. On several other oc¬ 
casions during the war, interpretative errors to be mentioned below 
were made by persons who had taken formal university statistics 
courses and who were regular users of statistical methods. It is believed 
worthwhile, therefore, to point out these fundamental errors. 

1. In testing goodness of fit of a trend line to a time series, Kelley 
treats the number of observations as independent observations in computing 
the number of degrees of freedom.² It is obvious (practically) 
that the observations are not independent. 

2. In estimating and testing the period of a series of residuals, after 
removal of trend, the number of observations is again used in order to 
determine the number of degrees of freedom. In this particular analysis 
it might be contended that if periodicity does not exist in the residuals 
the residuals are independent; and hence, the assumption of inde¬ 
pendence is acceptable for purposes of testing the null hypothesis of 
non-periodicity. This contention would overlook the fact that the 

¹ Truman L. Kelley, “The Evidence of Periodicity in Short Time Series,” Journal of the American 
Statistical Society (1943), Vol. 43, pp. 319-326. 
² Ibid., pp. 320-325. 

H. T. Davis, The Analysis of Economic Time Series (Bloomington, Indiana, 1941). 

H. Wold, A Study in the Analysis of Stationary Time Series (Uppsala, Sweden, 1938). 

Maurice G. Kendall, “On the Analysis of Oscillatory Time Series,” Journal of the Royal Statistical 
Society, Vol. CVIII (1945), p. 93. This more recent article is cited because of its general excellence and 
comprehensiveness. 



power of the test of non-periodicity as against various alternative 
lengths of periods is a function of the number of degrees of freedom 
(ignoring the extension into the sphere of varying amplitudes). As al¬ 
ternative hypotheses about the length of the period are admitted, the 
number of degrees of freedom applicable changes. In particular, the 
longer the period the fewer the degrees of freedom. If the degrees of 
freedom were constant, the test would have a greater probability of 
rejecting the non-periodicity hypothesis as the admissible alternative hy¬ 
potheses of other period lengths are broadened to include longer periods. 
However, the degrees of freedom do decrease so that on balance if one 
assumed a fixed number of degrees of freedom he would not be controlling 
the probability of errors of the first type, namely rejecting the 
non-periodicity hypothesis when the series is in fact non-periodic.³ 

3. A common error of interpretation is present also. “Values of progressively 
less and less than .5 provide greater and greater evidence 
that b is not a chance deviation from zero.”⁴ By itself this statement is 
correct if it means that small values of P make it harder to accept the 
hypothesis that b is equal to zero (for a given power function). But it is 
apparent that other connotations are implied. “As P from this variance 
ratio is less and less than .5, there is greater and greater evidence that a 
period in the neighborhood of T exists.”⁵ If the test is biased there is no 
such evidence. The degree to which evidence is discriminatory is dependent 
upon the power function. In simple terms, evidence favors one 
alternative hypothesis only if it is relatively more probable under it and 
relatively less probable under other hypotheses. The following statement 
appears a little later: “ . . . P=.048. Having odds of 952 to 48 
that -1877.9 is not a chance deviation from zero. . . . ”⁶ The second 
sentence is a non sequitur. If it were a valid extension, it would mean 
presumably that of all the times that one observes such deviations 
(those giving P=.048) .952 of such cases would in the long run have 
occurred from parameters not zero. Only by using Bayes’ theorem can 
one obtain the type of statement employed by Kelley. The proportion 
of times in which such deviations are observed is a function of the experienced 
(true) parameters and not at all a result of the particular 
observed sample in any given case. What the probability does mean is 
that the probability of getting such a chance deviation or worse, if 
there is non-periodicity, is equal to .048. One cannot simply interchange 

³ J. Neyman and E. S. Pearson, “On the Problem of the Most Efficient Tests of Statistical Hypotheses,” 
Phil. Trans. Royal Society of London, Series A, Vol. 231, p. 289. 

⁴ Kelley, supra, p. 320. 

⁵ Ibid., p. 323. 

⁶ Ibid., p. 325. 




NOTE ON TRUMAN KELLEY’S ARTICLE 

the role of the hypothesis and the sample evidence and still have a 
correct statement.⁷ 

Later there appears “The smallest P is .152, found when T=16. Thus 
the odds are about 5 to 1 that a period in the neighborhood of 16 years 
exists.”⁸ In this statement and procedure of getting a P of .152 several 
errors are combined. a) The selection of the smallest probability from a 
set of independent probabilities will result in obtaining too many (more 
than would be obtained by a random selection) significant probabilities. 
This point has been covered by others.⁹ It is especially applicable to 
periodigram analysis. b) The probabilities from the periodigram are 
not independent so that it is impossible to apply correct methods designed 
to test the significance of the largest (or smallest) of a set of 
independent probabilities. c) Finally the second sentence is again a 
non sequitur. Suppose the probability had been .001. Would that then 
indicate that the odds are 999 to 1 that this is not a chance deviation 
and is instead a real difference? Obviously not. It merely says that if 
the truth is non-periodicity, then one would obtain by chance such a 
deviation only once in 1000 times (more accurately with probability 
.001). One cannot interchange the role of hypothesis and sample evidence! 
Consider the other extreme probability; suppose the probability 
in the periodigram for a particular period is 1.00. Clearly that is not 
proof that such a period is not in the series! Not only is the second 
sentence a non sequitur, it is wrong. If it were correct it would be necessary 
that of all the cases in which deviations with probability .15 were 
observed, .85 of those cases would (‘on the average’) have come from 
situations in which there really was periodicity of length T=16! This 
requires Bayes’ theorem. There is no indication that Kelley has anywhere 
in mind any a priori parameter distribution. And if he did, the 
simple complement of the probability is not the correct one if one wishes 
to apply Bayes’ theorem.¹⁰ The appropriate formula is somewhat more 
complex. 
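
The point that the complement of P is not a posterior probability can be illustrated with a small Python sketch of Bayes' theorem for two hypotheses. The prior probability of periodicity and the probability of the observed evidence under periodicity are hypothetical inputs chosen here purely for illustration; neither appears in Kelley's article or in this note, and the two-hypothesis setup is a simplification.

```python
def posterior_periodic(prior, prob_evidence_if_periodic, p_value):
    """Posterior probability of periodicity given the observed evidence.

    prior: assumed a priori probability that a period exists (hypothetical).
    prob_evidence_if_periodic: assumed probability of a deviation at least
        as large as that observed when a period exists (hypothetical).
    p_value: probability of such a deviation under non-periodicity.
    """
    joint_periodic = prior * prob_evidence_if_periodic
    joint_chance = (1 - prior) * p_value
    return joint_periodic / (joint_periodic + joint_chance)


# With P = .152 the answer depends entirely on the assumed inputs.
for prior, prob in [(0.5, 0.152), (0.1, 0.5), (0.5, 0.9)]:
    print(prior, prob, round(posterior_periodic(prior, prob, 0.152), 3))
```

If the evidence is no more probable under periodicity than under chance, the posterior simply equals the prior, whatever the value of P; only the comparison of the two probabilities of the evidence moves it.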

In summary of this particular error, the confusion appears to arise 
basically from a failure to understand adequately the fundamental procedure 
by which one increases his knowledge on the basis of evidence. 
Always one must ask how probable is the evidence that has been ob¬ 
served, if a certain hypothesis or cause is the true one. He must also ask 


⁷ Neyman and Pearson, supra. 

⁸ Kelley, supra, p. 326. 

⁹ W. G. Cochran, “The Distribution of the Largest of a Set of Variances as a Fraction of Their 
Total,” Annals of Eugenics (London), Vol. 11, p. 47. 

¹⁰ J. V. Uspensky, Introduction to Mathematical Probability (New York: McGraw-Hill, 1937), pp. 
60-73. 




the same question about the probability of the observed evidence under 
the assumption that it came from some other hypotheses. After com¬ 
paring these probabilities of obtaining the evidence (not of the hy¬ 
potheses!) he can venture to draw conclusions. Never can he draw a 
conclusion until he has so obtained these different probabilities. At 
worst, he must make some intuitive guesses about the relative probabilities 
under various possible hypotheses. Lacking even an intuitive 
guess he can draw absolutely no conclusion whatsoever! And unless one 
uses Bayes’ theorem, he simply cannot proceed to the next desirable 
(but nevertheless impossible) step of making a statement about the 
probabilities of the hypotheses!¹¹ 

4. Finally, the basic procedure used to detect periodicity (the use 
of a periodigram) is inappropriate. The periodigram technique used is 
appropriate to determination of the length of the period if the series is 
known to contain a period. Kelley’s problem however is that of determining 
whether or not one exists.¹² 

The writer regrets the occasion for this note, but these same errors 
are altogether too frequently committed elsewhere. It is another exam¬ 
ple of the adage that a statistician probably spends more of his time 
telling what not to do rather than doing. 

Armen A. Alchian 

University of California, Los Angeles 

II 

I do consider the number of observations as the number of inde¬ 
pendent items of information, or the number of degrees of freedom. No 
matter how brief the time interval between observations, if they are 
not consequent to the same instrumental errors (including judgment 
errors if estimates have been made) I would say that there is justification 
for calling them independent measures. If there is some other 
standard I fail to grasp it. However, I do not see that this point is 
crucial to Alchian’s further criticisms. 

Before discussing the main issue, which is that of inverse probability, 
I will comment upon the criticism of my statement that “The smallest 
P is .152. . . . Thus the odds are about 5 to 1 . . . .” The value P=.152 
was taken from a table giving P values for 17 different periods, 2-years, 
3-years,... 18-years. As the value in that table there is no act of selec¬ 
tion, but the moment it is taken out of that table because it is the 


¹¹ Neyman and Pearson, supra. 
¹² M. G. Kendall, supra. 





smallest and put in the text there is an act of selection, so I admit that 
my statement, correct in the table, is a misstatement in the text and 
that Alchian’s point (a) is sound. I am not sure about his point (b), but 
if in place of “impossible to apply” he had written “impossible for 
Kelley to apply” I would fully subscribe to it. His point (c) is in part 
covered in the first of this paragraph and is otherwise approached in 
the following discussion of inverse probability. 
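
The effect of selecting the smallest of the 17 tabled P values can be gauged with a short Python sketch. It assumes, purely for illustration, that the 17 probabilities are independent and uniformly distributed when no period exists; as noted under point (b), periodogram probabilities are not in fact independent, so the figure is indicative only. The function name is hypothetical.

```python
def prob_smallest_at_most(p, k):
    """Chance that the smallest of k independent null P values is <= p."""
    return 1 - (1 - p) ** k


# Smallest of 17 tabled values, observed at P = .152
print(round(prob_smallest_at_most(0.152, 17), 2))
```

Under these assumptions a smallest value of .152 or less among 17 would turn up by chance in about 94 cases out of 100, so the selected value carries little evidence of a period.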

In the universe of situations in which designated null hypotheses do 
not hold other null hypotheses will hold. One very large class of these 
will be those in which the shift of a single parameter will lead to a null 
hypothesis which now holds. True, this is just a sub-class of the uni¬ 
verse of situations, but a sufficiently large sub-class that a statement of 
probability covering it will be useful in determining conduct. In fact it 
seems to the writer that for most, perhaps all, of the problems of life a 
more valid determiner of conduct is unavailable. To illustrate: We 
have a null hypothesis that the true mean is M₁. The actual sample 
mean is M and we find that P=.048. That is, if the true mean is M₁ a 
deviation as great as that found will, in the long run, occur in .048 of 
the samplings. Now consider the divergence from the sample mean M 
of the true mean M. All parameters except the mean being constant in 
many samples of different sorts all of which have a sample mean M the 
true means will diverge from M by an amount greater than (M₁ − M) 
in .048 of the samples. Thus in this restricted realm inverse probability 
holds. (We may note in passing that this restricted realm may be far 
less restricted by making linear transformations of the original varia¬ 
ble.) 

Let us say that this probability is small enough to lead to a certain 
course of action. Nothing but time and further experience will prove 
whether the course of action is sound or not. A double risk has been involved: 
first the P = .048 instead of some smaller amount and second is 
the materiality of other parameters than the one studied. As a matter 
of logic both of these risks must be present. 

To set up alternative hypotheses and determine which of the two is 
more acceptable does not change the situation. It sharpens the issue so 
far as P is concerned, but it may augment the other risk for we must 
ask if the two hypotheses were the best two to use in determining 
future action. 

Though I subscribe to the soundness of Alchian’s reasoning I hold 
that the use of inverse probability has much practical warrant and 
usefulness. I suspect that in deciding upon a course of action from given 




data Alchian and I would reach the same conclusion, though we employ 
somewhat different logical processes en route. 

Alchian’s final point, that the periodogram cannot be used to detect 
periodicity, is indeed sweeping. I know of no logical approach to 
the question of existence independent of that of amount to which the 
thing exists. Dr. E. L. Thorndike’s oft quoted dictum “If a thing exists 
it exists in some amount” has been most serviceable in the field of psy¬ 
chology. It should equally serve other fields. 

I appreciate the fine tone and precision of Alchian’s article and have 
the hope that we are really not as far apart as his criticisms imply. 

Truman L. Kelley 



WILLIAM LANE AUSTIN (1871-1949) 

JAMES CLYDE CAPT (1888-1949) 

Two former Directors of the Bureau of the Census, both valued mem¬ 
bers of the American Statistical Association, have passed away in 
recent months. William Lane Austin, Director of the Census from 
1933 to 1941 and a Senior member of the Association, died in Green¬ 
ville, Mississippi on October 10. James Clyde Capt, who succeeded 
Austin as Census Director in 1941 and remained Director until his 
resignation because of illness on August 10, 1949, died in Washington 
on August 30. 

A review of the careers of these two men (alike in some respects and 
differing widely in others) brings into sharp focus some of the changes 
during recent decades in our concepts of the role of statisticians in 
government service. Austin entered the temporary Census Office in 
1900, two years before the Permanent Census Act created the present 
Bureau. He served successively as statistician in charge of the Census 
of Plantations; Chief Clerk; chief statistician in charge of the Censuses 
of Agriculture in 1920, 1925 and 1930; Assistant Director; and finally 
Director. 

During this 40 year period he learned his statistics from experience 
with Census operations. And despite the brilliant analyses of census 
problems by General Francis A. Walker and some other former Di¬ 
rectors, it was not generally recognized that Census operations required 
special skill or presented technical problems. Austin developed hard- 
headed shrewdness and skill as a negotiator and as the supervisor of 
personnel whose training in most cases, like his own, had been limited 
to experience in the ranks. It was when he became Director in 1933 
that he first displayed abilities and attitudes that co-workers of a lifetime, 
to quote the late Joseph H. Hill, “had never suspected him to 
possess.” 

The immaturity that had previously characterized statistics as a 
profession was beginning to disappear and Austin felt the need to 
strengthen the personnel of the Bureau. With astonishing finesse in 
avoiding offense to life-long colleagues in the service (men who distrusted 
“impractical theorists”) he began building a staff having technical 
qualifications that he did not himself possess or even understand. 
Austin was the founder of the modern, efficient, and technically competent 
Bureau of the Census that serves the nation today. 

Capt entered the Bureau of the Census in 1939, as Executive As¬ 
sistant to Austin. Unlike the latter, he was without any previous ex¬ 
perience in dealing with technical statistical problems, and was com¬ 
pletely unknown to the statistical fraternity and to those users of data 
throughout the land who view the Census Bureau with somewhat the 



same veneration that lawyers hold toward the Supreme Court. It is 
no exaggeration to say that many of these were profoundly apprehen¬ 
sive two years later when the still new Assistant was named Director, 
at Austin’s retirement. 

I well remember my own doubts en route to my first call upon the 
new Director, and the sense of reassurance with which I left his office. 
Nor was the spirit of trust and understanding engendered at that first 
contact ever dimmed in any of our later official or personal relations. 
Increasingly I came to regard “J.C.,” as he was affectionately known, 
as a superb administrator. His direction of the Bureau’s affairs impressed 
me as strict, unaffected by “old school ties,” personal friendships 
or biases, and always fair and just. He was quick to admit mistakes 
and shortcomings, but coupled his admissions with determination 
that the same mistakes would not occur a second time. In his relations 
as Director of the Census with other agencies, he was always cooperative, 
unambiguous in his position, and decisive. 

I believe, however, that the achievements of his administration 
sprang primarily from the quality that he shared so conspicuously with 
Austin, namely: respect for and deference to competence in others. 
This seems to me to have gone farther than the layman’s frequent 
deferral to the specialist, and to have included a willingness to give his 
personnel free rein, even in areas of administration in which he could 
have claimed a superior right to judgment. They must prove their 
initiative by success, but the opportunity thus presented them re¬ 
flected an inherent and commendable modesty in their chief. It made 
for staff loyalty. 

The administrations of these two Directors represent, very for¬ 
tunately, a consistent and continuous pattern of growth. The strength 
now exhibited by the Bureau of the Census, particularly in the high 
calibre of its technical and administrative staff, had its beginnings 
under Austin, and developed at an accelerated pace under Capt. The 
all-important need of the Bureau was the building up of the organization 
to a position of leadership in statistical techniques and operating 
methods. This seems to me the outstanding and culminating achieve¬ 
ment of Capt’s administration. Credit for his success must be shared 
with Philip M. Hauser, Ross Eckler, Howard Grieves and his other top 
assistants, upon whom the burdens of the Bureau administration during 
the crucial period of the Seventeenth Decennial Census now fall. Their 
tasks will be easier because of the solid process of Bureau-building that 
preceded. 

Stuart A. Rice 

Assistant Director in 
Charge of Statistical Standards 
Bureau of the Budget 



BOOK REVIEWS 

(Dr. Oscar Buros’ resignation as Review Editor was effective August 1, 1949. 
During the interim period until a new editorial committee is established, this 
section will be edited by Dr. Ernest Rubin for the Secretary's Office.) 

Probability Theory for Statistical Methods. F. N. David (Lecturer in Statistics, 
University College, London, England). London and New York: Cambridge 
University Press, 1949. Pp. ix, 230. 15s.; $3.50. 

Review by John W. Tukey 
Associate Professor of Mathematics 
Princeton University, Princeton, New Jersey 

Sixteen chapters of from 11 to 18 pages each, and a short seventeenth, 
emphasize that this book is based on a carefully organized set of lec¬ 
tures. The imprints of Karl Pearson and Jerzy Neyman are plain to see. 
On the one hand, the introduction of the binomial distribution, in Chapter 3, 
is followed by derivations and applications of the evaluations of binomial 
probabilities with the Incomplete Beta Function Table, Uspensky’s method 
using hypergeometric series, the normal approximation, and the Poisson 
distribution (which is regarded solely as a limit to the binomial). On the 
other hand, the Markoff theorem on least squares is carefully applied to 
stratified samples of minimum variance. 

Noteworthy are (i) a 17-page chapter on simple genetical applications, 
(ii) the elementary derivation of sampling moments for sample moments 
from finite populations, (iii) an account of differences of zero and their 
simplest uses, (iv) an emphasis on mathematical precision and the use of 
elementary methods, and (v) a clear-headed middle-of-the-road attitude 
toward the relation of probability theory to the real world. While it is hard 
to name courses in the United States where such a book would be an ap¬ 
propriate text, it should be helpful supplementary reading for both students 
and teachers. (The reader is supposed to be familiar with the notation for 
third and fourth moments (p. 31).) 

As is inevitable and appropriate, there are places where the author and 
the reviewer hold different views, but in only two places were definite errors 
detected. One is in the statement of the boundedness hypothesis for the 
Central Limit Theorem on page 217, where 0 ≤ m₂ should read 0 < m₂. The 
second is the derivation of Sheppard’s corrections on page 214, where the 
cumulants of G are expressed in terms of those for E and x as if E and x 
were independent, when in fact E determines x completely. 

The reviewer is unable to be sure of the meaning of "The fundamental 
probability set, written F.P.S. for short, will be just that set of individuals 
or units from which the probabilities are calculated” (p. 12) in view of the 
discussion on page 68. The only other place which seems likely to confuse 


the student is in the discussion of the “elementary probability law” of a 
continuous random variable, where a careless reader could conclude that 
p(x′) was “the probability that x takes the value x′.” 

The author is definite on such points as replacing the binomial distribution 
by a continuous distribution with stepped densities, "Such an assumption is, 
of course, wholly fallacious, since the binomial probabilities are a discrete 
set of points, but it is a useful aid to memory if not pursued too tenaciously” 
(p. 53; more tenacity would have replaced npq by npq + (1/12) with better 
approximation), the uselessness of the negative binomial, "It would appear 
wrong therefore to carry out calculations in which p and n are given negative 
values” (p. 65) and division by n or n—1, "If a measure of the scatter in 
the sample is required then—(dividing by n )—must be calculated. If it is 
desired to estimate the population standard deviation then the expression— 
(dividing by n— 1)—may be calculated because in the long run it will be equal 
to σ” (p. 126). Two points need to be made in the last connection. First, if
"in the long run” were well defined, the last statement would be wrong.
Second, the reviewer, and, he believes, an increasing number of other
statisticians, feel the time is coming soon to loosen the thrall of moments of
inertia, and always divide by n − 1.
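
The first point can be illustrated with a short calculation. The sketch below assumes independent normal observations (an assumption added here, not one stated in the book or the review) and uses the standard result E[s] = σ · √(2/(n − 1)) · Γ(n/2) / Γ((n − 1)/2), where s is the square root of the variance computed with divisor n − 1. The function name is hypothetical.

```python
import math


def expected_s_over_sigma(n: int) -> float:
    """Return E[s] / sigma for a normal sample of size n.

    s is the square root of the sample variance with divisor n - 1.
    Normality is an assumption of this sketch.
    """
    if n < 2:
        raise ValueError("need at least two observations")
    # log Gamma(n/2) - log Gamma((n-1)/2), computed in logs to avoid overflow
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)


if __name__ == "__main__":
    for n in (2, 5, 10, 30, 100):
        print(n, round(expected_s_over_sigma(n), 4))
```

The ratio is below 1 for every finite n and approaches 1 only as n grows, so under these assumptions the average of s over many samples falls short of σ even though the average of s² equals σ².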

References to further reading are given after each chapter. These could 
be slightly improved by giving dates (only 4 are given, including one on 
page 81) and by giving volume numbers in Arabic rather than Roman 
numerals (after all, statisticians no longer do arithmetic in Roman
numerals).

After mentioning a few of the things he likes, and all of those he doesn’t, 
it behooves the reviewer to point out what he failed to find. To fill the two 
prominent gaps in a reasonable manner, Miss David would have had to 
recast her order of presentation and bring in cumulants (semi-invariants) long 
before the next-to-last chapter. If this had been done, and k-statistics had
been added, then Irwin and Kendall’s (Ann. Eug. 12:138-142, 1944) powerful
and simple methods of deriving sampling moments of sample moments from
finite populations could have been used with a great gain in simplicity of
presentation and in power of tools. (The experience of the early workers with
heavy algebra and consequent errors points up the need for keen and powerful
tools.) Second, the early introduction of cumulants would have made it
possible to extend the present discussion of Lexis theory, which seems
ineffective, into a presentation of the essentials of the analysis of variance
without normality assumptions as developed by Pitman and Welch (cf.
Biometrika, Vol. 29). This would have been a most important addition.





Quality Control by Statistical Methods. G. Herdan (Lecturer in Statistics,
Department of Preventive Medicine, University of Bristol, Bristol, England).
Edinburgh, Scotland: Thomas Nelson & Sons, Ltd., 1948. Pp. xi, 251. 21s.

Review by Paul Peach
Associate Professor, Institute of Statistics
University of North Carolina, Raleigh, N. C.

The internal evidence of Dr. Herdan’s book proves that he has at least
some acquaintance with a wide variety of statistical methods. In his
eight chapters he discusses not only the usual material of quality control
literature (charts for fraction defective, charts for variables, and single and
double sampling plans based on attributes) but such less customary subjects
as correlation, analysis of variance, and inverse probability. He devotes a
whole chapter to the use of probability graph paper, a tool that certainly
deserves to be introduced to industrial workers. His is, I believe, the first
text on statistical quality control to mention the important notion of
components of variance. He includes notes on the t and chi-square tests, with a
table of t (but not χ², “in order not to swell the number of appendices
inordinately”). He has a brief note about generating functions. There is
perhaps not another book in the statistical field that touches upon so many
topics in so little space.

Elementary books are presumably written for the instruction of students,
the general idea being that beginners are expected to read the book and
learn from it certain lore that shall be true, or useful, or at any rate
acceptable to authority. The student who picks up Dr. Herdan’s book with any
such expectation had better be prepared to take a substantial consumer’s
risk. Naturally, no ordinary book of 250 pages can include adequate
expositions of all the topics Dr. Herdan has introduced. I don’t believe anybody
will learn to fit the least squares straight line by studying pages 115-118. The
Latin square on page 102 is almost void of explanation or context. The
discussion of acceptance sampling is based mostly on the work of Dodge and
Romig; but there is only half a sentence about the AOQL concept. The
professional statistician learns from these pages that Dr. Herdan knows about
linear regression and Latin squares; the beginner can learn, I fear, nothing.

The book suffers further from a neglect of common standards of literary 
craftsmanship. Both the writing and the typography are frequently obscure 
and confusing, and sometimes we find downright misstatements that are 
obviously the result, not of ignorance, but of carelessness. On pages 3 and 
4 we find what purports to be the equation of the normal curve, but so 
carelessly set up that no novice could be expected to read it rightly. In the 
discussion of the test for the significance of the difference of two means, some 
of the formulas are right, some wrong. In the chapter on probability graph 
paper the probability density function and the ogive are persistently
confounded.

Even in his literary references Dr. Herdan fumbles. He uses for one chapter
the whimsical subtitle “The Rod of Moses”; the rod in question was
Aaron's (Exodus 7: 8-12) and the allusion seems hard straining for a defective
analogy. Among his American authorities he mentions one, called
Cowder in one place and Crowder in another; the reference is to my esteemed 
colleague, Prof. Dudley J. Cowden. This latter error is in some respects 
typical. It could have been avoided with a minimum of care; people who 
know Dr. Cowden will penetrate the veil of obscurity; others will derive no 
information from the reference. 

One wonders how Thomas Nelson & Sons came to publish this book, 
what authorities served as their referees, whether the comments of these 
authorities were transmitted to Dr. Herdan, and what was done with the 
galley and page proofs. Something must have been neglected somewhere. At 
all events, the resulting book has in my opinion no possibilities for the
instruction of students.


Statistical Methods in Research. Palmer O. Johnson (Professor of Education,
University of Minnesota, Minneapolis, Minn.). New York: Prentice-Hall, Inc.
(70 Fifth Ave.) 1949. Pp. xviii, 377. $7.65.

Review by Frederick Mosteller

Associate Professor of Mathematical Statistics, Department of Social Relations 
Harvard University, Cambridge 38, Massachusetts 

A great many people have probably been intending to write a modern
statistical textbook similar to Johnson's. The notion is to write a rather 
advanced book suitable for students in education and psychology, suitable 
in the sense that the emphasis shall not be on the mathematics, but upon 
methods which are useful in research problems the students will face. People 
intending to write such a book now have two alternatives: they can either 
relax and forget about it, or they can raise their sights a good deal higher 
than they needed to before the publication of Johnson's book. 

The arrangement of the material resembles that of a handbook, which is 
rather appropriate in view of the title. This arrangement will facilitate use 
of this book by active research people who will find the form very convenient. 
The table of contents is helpful in this connection. The index looks
impressive, but the only thing I looked for I was unable to find (see below). The
methods used are described carefully, but succinctly, and illustrated by 
worked-out problems on real data. Methods given are heavily referenced and 
every effort is made to send the student to original sources. I do not have 
much hope that students will go to the sources, but do believe that teachers 
will be grateful for the references because not all who will want to use this 
book will be prepared to teach from it. This book can be enhanced by 
instructors able to expand on the material presented. The first 200 pages
could easily be inflated to 400 with no padding. 

Tables of the normal (a poor one), t, chi-square, F, and one of Nayer's
for testing differences in variance among several samples of the same size
are supplied. Over the years one gets sadder about tables of the normal. Why 
is it that authors only provide half of one table? It is perfectly true that by a 
little arithmetic one can get what one wants from such a table, but tables 
are for economy in time and errors. As a start I would like several normal 
tables: 1) cumulated from the left, 2) cumulated from the right (these should 
include negative arguments as well as positive), 3) cumulated symmetrically 
from the center, 4) cumulated symmetrically from the tails. I would also like 
companion tables using probability as the argument and the deviation as 
the entry. If it is objected that this is pure laziness, the energetic may still do 
their computations from the original function. I hope some textbook writers 
and publishers will cooperate in this matter soon. 

The book opens with a much better than usual discussion of the realm of 
statistics. To the discussion of general uses of statistics in economic research 
I would add the possibility of statistical or probability models. In the chapter 
on probability and likelihood the author is not especially careful about 
distinguishing between the notions of a probability limit (p. 20) and the 
usual concept of limit. Bayes Theorem and Maximum Likelihood are too 
briefly treated, but the latter is discussed more fully later. The book begins 
to open up in Chapter III on Sampling Distributions, discussing means, 
differences between means, variance, t, r, the z transformation for
correlations, the relation between z and F. Discussion of testing statistical
hypotheses follows smoothly, comparing fiducial and confidence limits (making
confidence limits sound a little more difficult than necessary), likelihood ratio 
and sequential tests. 

Standard procedures are given for testing means against standards with
standard deviation known and unknown, for finite as well as infinite
populations, testing differences, the Behrens-Fisher problem, the sign test, testing
differences in percentages using t. There is a footnote on page 81 which
bothers me in this connection: “The χ² test is an exact test for this problem.”
If no correction is made for continuity the stated formula is equivalent to
χ², but neither test can be regarded as exact as I see the matter. Continuing
with this chapter, methods are given for testing equal variability for two
or more groups, the significance of differences for correlations and regression
coefficients, 2 × 2 and larger tables, goodness of fit. I will not continue this
parade of techniques, but mention that later chapters cover estimation, 
interval and point, normalized distributions, special devices when the data 
are nonnormal, sampling theory and practice. All this is in 200 pages. The 
second half of the book handles analysis of variance and covariance, and 
multiple regression, including the discriminant function. 
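
The equivalence mentioned in connection with the page 81 footnote can be checked numerically. The sketch below assumes that the book's formula for a difference in two percentages is the usual large-sample ratio with a pooled standard error (an assumption, since the review does not reproduce the formula). The function names and the counts are hypothetical.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled large-sample statistic for p1 - p2, no continuity correction."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


def pearson_chi_square_2x2(x1, n1, x2, n2):
    """Uncorrected Pearson chi-square for the 2 x 2 success/failure table."""
    observed = [[x1, n1 - x1], [x2, n2 - x2]]
    row_totals = [n1, n2]
    col_totals = [x1 + x2, (n1 - x1) + (n2 - x2)]
    total = n1 + n2
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


if __name__ == "__main__":
    # Illustrative counts only: 30 of 100 versus 45 of 120.
    z = two_proportion_z(30, 100, 45, 120)
    chi2 = pearson_chi_square_2x2(30, 100, 45, 120)
    print(z * z, chi2)
```

The square of the ratio agrees with the uncorrected χ² to rounding error, which is the sense in which the two are equivalent. Both rest on the same large-sample approximation, so the agreement says nothing about either being exact.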

The analysis of variance examples are carried out in detail, one example 
running through 14 pages. The discussion is better than most such. Not 
enough emphasis is put on the meaning and interpretation of interactions, 
not enough is said about the limitations of the methods. I doubt that
educators are always interested in such questions as: Are the mean achievements
of students in the three grades equal, or are the school means on the reading
test equal? This is not so much a shortcoming of the book, but a shortcoming 
of the method as it is now available. This author does give with no heading, 
and no reference in the index that I can find, Fisher's method for comparing 
two particular means selected from a set (p. 234). 

Perhaps the best thing to say is that Prentice-Hall's advertisement for 
this book is correct in all particulars—except one, I doubt if the author did 
research with R. A. Fisher (sic), although the influence of R. A. is clear. 


Psychological Statistics. Quinn McNemar (Professor of Psychology, Statistics, 
and Education, Stanford University, Stanford, California). New York: John 
Wiley & Sons, Inc. (440 Fourth Ave.) 1949. Pp. 364. $4.50. 

Review by Edmund Churchill 
Assistant Professor of Mathematics, Department of Mathematics 
Antioch College, Yellow Springs, Ohio 

Professor McNemar's book is described as covering “all the statistical
techniques, except factor analysis, that are frequently useful in
psychological research.” Altho “frequently used” might be more accurate than
“frequently useful,” and there are omissions in either case, this book does 
cover the standard topics from descriptive statistics through correlation, 
confidence limits, chi-square, sampling, and the analysis of variance, plus a 
chapter on the too often neglected topic of analysis of covariance. 

The book, unfortunately, falls short of the author's aim to “provide a 
concisely, yet clearly written textbook which will lead to an appreciation of 
the place of statistics in psychological research.” Conciseness is far from 
apparent at many places in the book, and neither the writing nor the logic 
behind it are always clear. This reviewer believes strongly that the student 
will best gain an appreciation of the role of statistics if clear, meaningful, and 
interesting illustrations are given as each technic is introduced. In this 
respect this book is decidedly weak. At least a dozen of the technics or
procedures discussed in the book are not illustrated at all. Some are illustrated
by fictitious data, others by data of the type: two groups classified according
to 5 response categories (what groups? what categories?), 13 pairs of scores, 
IQ's of 161 five-year old boys, scores made by 50 college men on the Brown 
spool packer, etc. Here was an excellent opportunity both to enliven the 
text and to point the way to sound research practices by the frequent use of 
data from well designed (and well described) research. 

There is a fuzziness in the discussion at a number of points. The discussion 
of the null hypothesis (p. 223) as it relates to the difference between means 
is not only muddled but at variance with an earlier discussion. The
hypothesis of equal means does not imply equality of the population variances nor
does the hypothesis of equal means and variances lead to the t test. It is
interesting to note that, at different places in the book, the same hypothesis
of equal means does not involve equality of variances (p. 65), is said to imply 
equality of variances (p. 223), and is described as involving such equality 
as an assumption (p. 225). The author would have stayed within his
purposes if he had not only avoided this confusion of hypotheses and
assumptions, but had also included tests of the hypothesis of equal means and
variances and of the hypothesis of equal means without the assumption of equal
variances.

The spirit of the null hypothesis is well violated in a procedure for testing
the quantity (p₁ − q₁) − (p₂ − q₂) where the pᵢ and qᵢ are non-exhaustive
proportions in independent samples. Standard errors are obtained for (pᵢ − qᵢ)
on the basis of the null hypothesis: pᵢ − qᵢ = 0 (i = 1, 2). These standard errors
form the basis for testing the difference mentioned above, altho the
hypothesis that this difference is zero is a far cry from that used in getting the
standard errors. The concept of the quality of a statistical test or confidence
limit is missing. Obviously, no technical treatment of this concept is feasible
but the awareness that we use one test or another in a given situation
because we believe it is less likely to lead to error and that no test is always the
best one is surely a basic part of the understanding of statistical inference.

There are a number of minor points about the content of the book which
may be worth noting. The square root of s² is, as usual, incorrectly described
as unbiased. The sequence of histograms of (½ + ½)ⁿ with fixed base length is
asserted to converge to the normal curve; the sequence actually converges to
a straight line. Sheppard’s and Yates’s corrections are given without hint
that they can worsen instead of improve matters; this is especially true when
Yates’s correction is applied, as McNemar recommends, to 2 × 1 tables.
The labelling of McNemar’s graphs and the stubs of his tables, are, in several 
cases, atrocious and his treatment of approximate data is on the same level. 
There is no hint that in stratified sampling we might use samples in which 
the proportions are not in the same ratio as in the population, altho in 
general the best design calls for different ratios in sample and population. 
The equating of stratified sampling with the quota method may well confuse 
the student who has understood the quota method to be the one used by 
Gallup. 

Frequent use of elementary algebra is made to derive formulas or to point 
the direction of their derivation. Altho there are those who will object to 
this practice, it is undoubtedly sound and a practice to be encouraged. On 
the other hand, it is difficult to understand why calculus is introduced to 
half-derive the linear regression equations when the elementary algebraic 
derivations of these equations are so simple and straightforward.
Incidentally, a little of this algebra applied to the chi-square formula on page 207
would show that the maximum value of C for 2 × n tables is √(1/2) and
hardly rates classification as “unknown” (p. 182). One can also demonstrate
that the maximum value of C for any m × n table is the same as for an m × m
or an n × n table, whichever is the smaller.
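
The bound on C is easy to verify on a small case. The sketch below takes C to be Pearson's contingency coefficient, C = √(χ² / (χ² + N)), and uses the fact that χ² cannot exceed N(k − 1), where k is the smaller of the number of rows and columns, so that the maximum of C is √((k − 1)/k). The table is invented for illustration and the function names are hypothetical.

```python
import math


def contingency_coefficient(table):
    """Pearson's C for a table given as a list of rows of counts."""
    total = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (chi2 + total))


def max_contingency_coefficient(rows, cols):
    """Upper bound sqrt((k - 1) / k), k being the smaller dimension."""
    k = min(rows, cols)
    return math.sqrt((k - 1) / k)


if __name__ == "__main__":
    # A 2 x 4 table in which the column fixes the row completely.
    perfect_2x4 = [[20, 30, 0, 0], [0, 0, 25, 25]]
    print(contingency_coefficient(perfect_2x4), max_contingency_coefficient(2, 4))
```

For this table χ² equals N, so C reaches √(1/2), the same ceiling as for a 2 × 2 table.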




The discussion of test reliability presents clearly some of the difficulties
in estimating reliability, but its advice to base estimates on parallel forms is 
often completely unrealistic. The construction and administration of a 
second test form, if possible, would often represent an extremely inefficient 
use of research facilities. Split-half methods are mentioned but nothing is 
said about the effect of speed on these methods. The useful, reasonably 
sound, and rapidly computed Kuder-Richardson no. 20 is unmentioned.
The simple derivation of this formula by Jackson and Ferguson might well 
have been included here. 
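
For readers unfamiliar with it, Kuder-Richardson formula 20 for k items scored 0 or 1 is r = (k/(k − 1)) · (1 − Σ pⱼqⱼ / σ²), where pⱼ is the proportion passing item j, qⱼ = 1 − pⱼ, and σ² is the variance of total scores. The sketch below is a plain computation of that formula, not the Jackson and Ferguson derivation. It assumes the variance is taken with divisor N, and the data and function name are hypothetical.

```python
def kr20(scores):
    """Kuder-Richardson formula 20.

    scores: one row per examinee, each row a list of 0/1 item scores.
    The total-score variance uses divisor N (an assumption of this sketch).
    """
    n_people = len(scores)
    n_items = len(scores[0])
    if n_people < 2 or n_items < 2:
        raise ValueError("need at least two examinees and two items")
    totals = [sum(row) for row in scores]
    mean_total = sum(totals) / n_people
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_people
    if var_total == 0:
        raise ValueError("total scores have zero variance")
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in scores) / n_people
        sum_pq += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - sum_pq / var_total)


if __name__ == "__main__":
    # Invented data: every examinee passes all items or none.
    consistent = [[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]]
    print(kr20(consistent))
```

Only item proportions and the total-score variance are needed, which is why the formula is quick to compute from a single administration.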

The importance of homoscedasticity is stressed thruout the section on 
correlation (and might well have been stressed more often in the section on 
analysis of variance) but McNemar gives no hint as to how one may test a 
set of data for this mouth-filling property. Nor is the warning that many of 
the technics discussed are applicable only to normal data accompanied by 
advice or reference to sources of advice for the non-normal case. Much of 
this reviewer’s pleasure at finding a chapter on covariance analysis is offset 
by the omission of such things as the discriminant function and the
equivalent test, more useful, to be sure, than used, and the growing body of
rapidly computed “inefficient” statistics which can be of considerable value
in the early stages of a research program. 

The above remarks are not intended to imply that there is not much which 
is clear and sound in this book. But statistics has reached a point in its 
development where we can rightfully expect that beginning texts have the 
clarity and thoro soundness that we expect and find in a college algebra 
text and do not find in this text. 

Quality Control in Production: A Machine-Shop Manual on the Statistical 
Method of Controlling Product Quality During Manufacture. H. Rissik.
Foreword by Frank Gill. London: Sir Isaac Pitman & Sons, Ltd., 1947. Pp. 
xvii, 181. 21s. 

Review by H. A. Freeman 

Associate Professor of Statistics, Massachusetts Institute of Technology 
Cambridge, Massachusetts 

The title and subtitle describe the purpose of this book. Its chapter
headings indicate its content. They are: 1, Statistical Method and the
Quality Problem; 2, What is Quality Control? What are its Advantages? 3, The
Basic Principles of Quality Control; 4, Control Charts based on Dimensional
Measurement; 5, Control Charts based on Counting Defectives; 6, The
Organization of a Quality Control System. There are also an adequate set of
tables, a fair index, a good bibliography, and a foreword by Sir Frank Gill, Past
President of the Institution of Electrical Engineers.

It is hard for me to say if this book is better or worse than its many
competitors. Its statistical level corresponds to that now standard among
quality control manuals; it must therefore be rated on its effectiveness in
educating factory personnel. On this, I can only guess; my guess would be that this
is a superior book, neither as facile nor as extended as some American manuals,
but one which shows both that the author has digested his experience in
industrial quality control and that he knows how to write about it.

One novelty: A discussion of Tippett’s dual control chart is included. In 
this plan, the sampling results of go and of no-go gauges are used separately, 
the difference between the two numbers defective being a good estimate of 
the dimensional average, the sum, a good estimate of dimensional
variability.


Sampling Methods in Forestry and Range Management, Second Edition.
F. X. Schumacher (Professor of Forestry, Duke University, Durham, N. C.)
and R. A. Chapman (United States Forest Service, Washington, D. C.). Duke 
University School of Forestry, Bulletin 7, Revised. Durham, N. C.: the School, 
June 1948. Pp. 222. Paper, $2.00; cloth, $2.50. 

Review by Walter H. Meyer 
Professor of Forest Management 
Yale School of Forestry, New Haven, Conn.

The first edition of this valuable treatise appeared in January 1942. The
newly published revision attests well to the thoroughness and exactness 
of its predecessor, since scarcely a correction was found necessary in text, 
formulas, tables, or figures. One new chapter has been added, titled “Double 
sampling of individuals in representative sampling of groups; systematic 
computations.” The authors must again be commended for assembling in 
one volume the variety of sampling techniques most appropriate to the
complex fields of forestry and range management and for suggesting many
methods of approach to sampling procedures in cases where earlier and current
“attempts to extract sampling error are more akin to the art of the conjurer
than to scientific assay.” The authors state their purpose to be that of
encouraging the practicing forester or range technician to “acquire the art of
planning – and executing – suitable sample procedures, such that (1) the real
error may be assessed unambiguously; and (2) the best estimate is
obtainable ... consistent with the time and funds available for the sampling work.”
Mathematical derivation and proof is held to a minimum and involved
technical terminology is restricted with the result that the text can be readily
understood by the forester or range manager with only a preliminary
knowledge of the statistical method. If any criticism is to be made of this book, it
must be toward the restriction indicated by the title, for surely the methods 
advocated will find a far broader application than in the fields of forestry and 
range management alone. 

The first part, consisting of two chapters, deals with the statistical
background of sampling in its simpler aspects. The second part (Chaps. 3-7) deals
with direct estimates by sampling and the third part (Chaps. 8-13) with
indirect estimates through regression. A short appendix supplies some of the
essential mathematical background.

Part II on direct estimates by sampling takes up in order typical sampling 
schemes, starting from the unrestricted random sampling in finite and
infinite populations, then proceeds to stratified sampling, the simultaneous
sampling of two or more populations, subsampling and finally the
representative sampling of irregular blocks of known or unknown area. Each of these
has its appropriate place in the field of forestry. The forester may regret the 
scant attention that is given to systematic sampling, to which he has been 
addicted ever since the beginning of the use of sampling procedures in the 
estimation of wood volumes and which he probably will never discard
completely. He knows that systematic sampling, especially with the double
stratification that he uses (one for unit areas, the other for forest types) gives 
good results and he is supported in this view by the few known investigations 
involving the checking of 100 per cent enumerations by various types of 
sampling schemes. These have shown that systematic sampling gives results 
of high precision, superior to random or stratified random sampling under at 
least several typical conditions, but a precision which cannot be proven on 
the basis of the data and under current lack of a suitable mechanism of
analyzing systematic samples. Since the publication of the first edition of this
book, several authors have investigated the case of systematic sampling,
including Osborne, Yates, Madow and Madow, Finney and others, and have
demonstrated its suitability under certain conditions. The advantages of
systematic sampling in forestry are so many from the administrative and
financial point of view (and this is probably true also of many other fields)
that statisticians could well devote more time and effort in developing a
suitable theory.

Part III deals with the particularly valuable tool of regression as an aid in 
indirect estimation. Starting with simple linear regression for cases where the 
independent variable is free from sampling error and where it is not, it leads 
to the utility of purposive and mechanical selection of samples in obtaining 
an efficient regression equation. Conditioned regressions and the use of
weights in such regressions are followed by a short treatment of non-linear
regressions. Regression in representative sampling is shown to be an effective
device of correlating ocular estimates with measured values taken on part
of the general sample area. There follows the new chapter of double
sampling in the representative sampling of groups, a topic which is only briefly
treated, but appears to be the introduction of a rather detailed discussion of
the systematic computation of normal equations. The latter appears to be the
real purpose of this new chapter. The final chapter points out certain
practical aspects of sampling, including the definition of objectives; bias; size,
shape and structure of sampling units; the character of the sample itself; 
sampling intensity; and allocation of costs. A final new section, which does 
not appear in the earlier version, handles the allocation of optimum sample 
size to different strata. This section is called for in view of recent
developments in aerial photogrammetry, whereby an advance stratification of a
forested area can be made on a map showing groups of timber of varying average
volumes and variances.
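
The review does not say which allocation formula the authors give. One standard rule for strata of known size and variability is Neyman allocation, in which the sample taken from stratum h is proportional to Nₕ · Sₕ (stratum size times stratum standard deviation). The sketch below assumes equal sampling cost in every stratum and ignores rounding to whole plots. The stratum figures and function names are hypothetical.

```python
def neyman_allocation(total_sample, stratum_sizes, stratum_sds):
    """Split total_sample across strata in proportion to N_h * S_h."""
    weights = [size * sd for size, sd in zip(stratum_sizes, stratum_sds)]
    weight_sum = sum(weights)
    return [total_sample * w / weight_sum for w in weights]


def proportional_allocation(total_sample, stratum_sizes):
    """Split total_sample across strata in proportion to N_h alone."""
    size_sum = sum(stratum_sizes)
    return [total_sample * size / size_sum for size in stratum_sizes]


if __name__ == "__main__":
    # Invented strata: many low-variance units, few high-variance units.
    sizes = [500, 300, 200]
    sds = [10.0, 40.0, 80.0]
    print(neyman_allocation(330, sizes, sds))
    print(proportional_allocation(330, sizes))
```

With these invented figures the most variable stratum receives the largest share of the sample although it is the smallest in area, which is the practical use of an advance stratification by variance.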


Cybernetics. Norbert Wiener (Professor of Mathematics, Massachusetts
Institute of Technology, Cambridge, Mass.). The Technology Press. New York
16: John Wiley & Sons, Inc. (440 Fourth Ave.) and Paris: Hermann et Cie, 
1948. Pp. 194. $3.50. 

Review by Sebastian B. Littauer 
Associate Professor of Industrial Engineering 
Columbia University, New York 27, N. Y.

Cybernetics is the science of communication and control in machines and
men. The title was coined from the Greek κυβερνήτης in order to give
identity to a new set of concepts developed by the author and his associates 
and further integrated by the author into a unified discipline as presented in 
this work. It encompasses a variety of branches of science in a manner which 
convincingly demonstrates the vitality and fruitfulness of the point of view 
of unification in the sciences. Many problems in neurology, psychopathology 
and communication engineering are shown to have a common core of
meaning in which the methods of mathematics, logic and statistics are intrinsic
and unifying factors. 

The scope of the book is such as to appeal to a wide audience, including not 
only the specialists familiar with one or more of the particular fields dealt 
with, but also the non-specialist who is seriously concerned with the social
implications of achievements in cybernetics. For, although the presentation
is in considerable part mathematical, there is sufficient general discussion to
acquaint and alarm any reader with the aims and potentialities of
cybernetics. Of this the author takes cognizance, and in the closing pages of the
introduction (an intensely interesting document on the circumstances which
motivated and influenced these researches) he warns of the possible
consequences to man that may follow from the creation of sensitive automata
which can replace not only the human arm but also, in its simple functions,
the human brain. In this, in spite of the fact that in the chapter on
Newtonian and Bergsonian Time he argues the essential similarity between the
functioning of the living organism and the cybernetic mechanism, he enjoins upon
us the moral responsibility not to identify the human being as a commodity
whose value is determined by the market place. 

There are two fundamental aspects of the present work which largely
contributed to its development as a new and unified science. One centers
around the concept of the message and transmission of information and
the other stems from the identification of the problems incident to the
development of this concept in the nervous system with those encountered in
some mechanisms. The common phenomenon of undesired hunting as a
result of excessive feedback is quite analogous to a pathological condition
known as purpose tremor. The recognition of this relation led to the study
of certain aspects of neurophysiology as a feed back cycle of information
circulating from the nervous system to the muscles and reentering the
nervous system through the sense organs. The possibilities inherent in research
from this point of view were presented by Wiener, Rosenblueth, and Bigelow
in a paper entitled Behaviour, Purpose and Teleology which appeared in
Philosophy of Science in 1943. The similarity of the information cycle in man
and mechanism is suggested as a fruitful means of study of the former by
experiment with the latter. The promise of this work presents a strong
argument for combined effort by workers in each of these fields, fortified by some
common ground of knowledge and vocabulary. It is part of the mission of 
cybernetics to encourage this closer tie. 

The message, to quote, “is a discrete or continuous sequence of
measurable events distributed in time – precisely what is called a time series by the
statisticians.” An operator or apparatus (a predictor) which follows a
message finds conflict between the necessity for fidelity of response to smooth
inputs (slow rates of change) and rapid response to sudden and large changes
of input. Reconciling this conflict requires an appeal to the statistics of time
series and the calculus of variations in order to find an operator deemed one
of optimum prediction in that the mean square error of prediction is
minimized.
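
The book treats general operators on continuous series. A much reduced discrete analogue, offered only as an illustration of the mean-square criterion and not as the book's method, is the one-step predictor x[t+1] ≈ a · x[t], with a chosen by least squares. The simulated series, its parameters, and the function names are hypothetical.

```python
import random


def simulate_ar1(phi, n, seed=0):
    """Simulate x[t+1] = phi * x[t] + noise with unit-variance normal noise."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x


def best_one_step_coefficient(x):
    """Least-squares a for the predictor x[t+1] ~ a * x[t]."""
    numerator = sum(x[t] * x[t + 1] for t in range(len(x) - 1))
    denominator = sum(x[t] ** 2 for t in range(len(x) - 1))
    return numerator / denominator


def mean_square_error(x, a):
    """Average squared one-step prediction error for coefficient a."""
    errors = [(x[t + 1] - a * x[t]) ** 2 for t in range(len(x) - 1)]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    series = simulate_ar1(phi=0.8, n=2000, seed=1)
    a_hat = best_one_step_coefficient(series)
    print(a_hat, mean_square_error(series, a_hat))
```

The fitted coefficient is built from the lag-one product sum and the sum of squares of the series itself, which is the elementary form of the statement that the optimum predictor is determined by the statistics of the time series.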

The problem of disentangling a message from contaminating noise, or for
that matter separating two messages, encountered in wave filter design
presents a similar statistical picture. The transmission of information is
statistical in that a prediction operator, in some specified sense optimum, must
be based on the statistics of the time series of the message to be followed. The
transmission of information is the transmission of alternatives, where the
unit of information is that transmitted as a single decision between two
equally probable alternatives. The amount of information is expressed as the
negative logarithm to the base two of the number of such decisions that are
to be made in attaining a particular observation. This, it can be seen, is the
negative of the measure of entropy, whence information or message is
identified with negative entropy or the degree of organization in a system. The
system of statistics developed for handling these problems becomes that of
knowledge which can be expressed in a binary system. The results of these
methods in determining prediction operators have been applied to computing
machines, wave filters, simulated nervous system information cycles and the
like, and found to be practically effective.
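
The unit of information described above can be made concrete with a short Python sketch. It assumes the usual reading in which an event of probability p carries -log2(p) units, so that a single decision between two equally probable alternatives carries exactly one unit; the function names are illustrative and are not taken from the book.

```python
import math


def information_bits(probability):
    """Information, in binary units, carried by an event of the given probability."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must lie in (0, 1]")
    return -math.log2(probability)


def average_information_bits(probabilities):
    """Average information per observation over a set of alternatives."""
    return sum(p * information_bits(p) for p in probabilities if p > 0)


# One decision between two equally probable alternatives: one unit.
print(information_bits(0.5))
# Eight equally probable alternatives: three binary decisions.
print(information_bits(1 / 8))
# Average over four equally probable alternatives.
print(average_information_bits([0.25, 0.25, 0.25, 0.25]))
```

The average computed by the second function is the quantity whose sign convention links information to entropy in the sense the review describes.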

The mathematical developments are given in the chapters: Groups and
Statistical Mechanics; Time Series, Information and Communication;
Feed-Back and Oscillation. Familiarity with Lebesgue integration, probability
theory and Fourier analysis, as well as with electrical circuit theory, is
helpful in reading these chapters. Nevertheless, their essence is remarkably well
presented in verbal form. The statistics presented are based on knowledge of
the complete past of a time series; prediction based on sampling the history
of a time series has but recently begun to be developed by a number of workers.

Implications and consequences of the theory developed are numerous.
One possibility in quantum mechanics is quite promising; for, granting a
hypothesis on the state of cosmic noise and degrees of freedom of the system,
the author's theory of information implementing the concept of negative
entropy may be the proper means for deriving the Schroedinger equations
from the Maxwell equations. It will be interesting to observe developments
in this direction.

A striking feature of this book is the highly provocative nature of some 
of the author's speculations. For example, anyone familiar with problems 
in psychiatry and related attempts at practical therapy will be strongly 
persuaded by the cybernetic explanation offered for the nature of a class of 
disturbance and for the occasional effectiveness of its current treatment. 
Certainly within the field itself no explanation has met acceptance, and 
the stimulating suggestions offered by the author are to be welcomed for 
their promise of effective and practical progress. To the largest body of 
readers of cybernetics, and to the readers of this journal in particular, the 
chapter, Information, Language and Society, speaks most directly. The 
author is not optimistic about the immediate usefulness of the methods of 
cybernetics in the resolution of the problems of society, as are some of his 
colleagues in the fields of anthropology and sociology who have been urging 
him to take off in their directions. But he does show quite simply and directly 
in terms of the ideas developed in this book that most of the apologetics for 
the state of society in which our literature abounds, can in the very nature 
of things be but abortive and sterile efforts. In a framework within which 
there is no statistical control and in a time scale in which there can be only 
short runs, there can be no science of society, if science means prediction. 
The direction for fruitful work in the study of society is definitely pointed. 
If the influence of this book, and in particular of the last chapter, is such as 
to bring about a more realistic consideration of the problems of human
survival, it will have done well.

Cybernetics had to be written, not only for the formal science which it
presents, but also for the stimulation and enlightenment it offers to the
non-scientist who in his turn does influence the direction of scientific
inquiry. This is a unique book in its content, but it is also exceptional for its
style, which manifests at times an expressive grace and incisiveness which
cannot but compel the enthusiastic attention of the reader. Since it is
reasonably certain that there will be further editions, a few improvements will be
welcomed, such as correction of a number of misprints in the mathematical
and verbal text, and inclusion of an index, as well as more references to the
literature. In its depth, breadth and unity, Cybernetics is a powerful and
important work; the author is to be congratulated for bringing his scientific
knowledge and insight to bear upon problems of profound concern to a wide
audience.



580 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1949 

Sampling Methods for Censuses and Surveys. Frank Yates (Head of the
Department of Statistics, Rothamsted Experimental Station, Harpenden, Herts,
England). London: Charles Griffin & Co., Ltd., 1949. Pp. xiv, 318. 24s. (New
York: Hafner Publishing Co., Ltd. $6.00.)

Review by W. Edwards Deming 
Adviser in Sampling, Bureau of Budget 
Washington 

Here at last is a real book on sampling. It is a pleasure to review such
an outstanding contribution. In the hands of an expert a probability
sample can be designed so that it meets pretty closely some prescribed
tolerance of sampling error (such as ±2%, ±15%), with a desired
probability, and at the lowest possible cost per unit amount of information.
Moreover, regardless of any assumptions that went into the planning, any
estimate made from the sample is accompanied by an irrefutable index of
precision. All this is made clear. In modern sampling practice it is also
considered desirable to procure ancillary information concerning various
alternative procedures, such as a different version of the questionnaire, or a
different plan of hiring, training, interviewing, or supervising. This
ancillary information illuminates the results obtained from the main survey,
and is obtained by carrying out simultaneous or supplementary samples in
addition to the main survey. Such ancillary information is particularly
desirable for large surveys and complete censuses, as is coming to be the
practice.
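
To make the idea of a prescribed tolerance met with a desired probability concrete, here is a minimal Python sketch for the simplest textbook case. It assumes simple random sampling from a large population, an estimated proportion, and the normal approximation; none of this is taken from the book under review, and the function name and arguments are hypothetical.

```python
def required_sample_size(proportion, tolerance, z_value):
    """Sample size for which z_value standard errors of a proportion equal the tolerance.

    Assumes simple random sampling from a large population and the
    normal approximation to the sampling distribution.
    """
    if not 0.0 < proportion < 1.0:
        raise ValueError("proportion must lie strictly between 0 and 1")
    if tolerance <= 0.0:
        raise ValueError("tolerance must be positive")
    variance_per_unit = proportion * (1.0 - proportion)
    return (z_value ** 2) * variance_per_unit / (tolerance ** 2)


# Tolerance of plus or minus 2 percentage points, with z = 1.96
# (roughly 95 per cent probability), in the worst case of a 50 per cent proportion.
print(round(required_sample_size(0.5, 0.02, 1.96)))
# A looser tolerance of plus or minus 15 percentage points needs far fewer units.
print(round(required_sample_size(0.5, 0.15, 1.96)))
```

A real design of the kind the review describes would also weigh stratification, multi-stage selection, and cost, which this sketch leaves out.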

A sufficient background of mathematical theory and practical experience
in dealing with human populations, farms, agricultural production,
inventories, business activity, and numerous physical materials has now
accumulated, by which probability samples may be designed for a great
variety of materials and problems.

The techniques of sampling have advanced rapidly during the past few 
years, but the requisite knowledge and experience have been largely confined 
to an inner circle of masters who have trained numerous apprentices by the 
spoken word. This is an unhealthy state of affairs, because the demand 
for competent statisticians has been running further and further ahead of 
the supply. No book can ever replace the privilege of working with a master, 
but Yates, indeed one of the masters, has put about as much inspiration and 
guidance into a book as is possible. Unfortunately, there is no royal and easy
road to the top, but statistical teaching centres will now have an
authoritative book to use for studies in sampling. The book will be useful for
instruction in internships, and the private study of innumerable struggling,
mathematically-inclined statisticians in government offices throughout 
the world will now be much more effective. 

Other study and teaching aids in advanced theories of sampling do exist;
for example, Thionet's Méthodes Statistiques Modernes des Administrations
Fédérales aux Etats-Unis and A Chapter in Population Sampling produced
by the sampling staff of the Bureau of the Census in Washington, 1947;
also Mahalanobis’s unsurpassed article entitled “On Large Scale Sample
Surveys,” in the Phil. Trans. Royal Soc., Vol. 231B, 1944; and Walter
Hendricks's Mathematics of Sampling (Virginia Agricultural Experiment
Station, 1948), but these productions have not the scope of Yates’s book.

The very title of the book is a clever statement of the attainment of 
modern methods of sampling. The title suggests, as is true, that sampling
methods are now used not only for occasional or periodic special surveys 
of various kinds, but actually for replacing, broadening, and calibrating the 
data heretofore reserved for complete censuses of population, agriculture, 
commerce, etc. 

The fact is that sampling is the modern approach toward all kinds of data. 
Sampling is the art and science of acquiring whatever information is
desired, at the lowest possible cost, and with an objective index of precision.
For detailed and precise tables by small areas, the most economical sample 
may of course turn out to be a complete census. 

A glance at the table of contents is sufficient to show the extremely broad 
coverage that the author has included. Space will not permit a list of the 
topics treated, but they include a discussion of the ordinary biases that are 
encountered and the ways in which modern sampling design avoids and 
corrects these biases. The book begins at the beginning: the definition of
the sample unit or units that are to be used, and the building or acquisition
of the frame. In accordance with the recommendations of the U. N.
Sub-Commission on Statistical Sampling, the term frame denotes a clear and
unambiguous listing or mapping of the sampling units. In multi-stage
sampling a frame will be required for every sampling unit that falls into the 
sample, in preparation for the next stage of sampling. The author discusses 
the efficiencies of various ways of drawing various kinds and sizes of sampling 
units, and of calculating the estimates. 

In modern sampling design, the formula and procedure by which the
estimates are to be prepared are as important as the procedures of selection; in
fact, the two together constitute the sample design. Yates gives a splendid
treatment of various methods of computation, along with excellent
discussions of the amount of labor that is saved or entailed through the use of
constant and variable weighting factors. He gives a well-balanced treatment
of ratio-estimates, two-stage sampling, stratified sampling, variable sampling
fractions, for some of which he was required to develop new theory to fill
in the gaps. The estimate of the gains arising from multiple stratification
(Sec. 8.4), and of using a plan of partial replacement of sampling units in
successive samples (Sec. 88) will illustrate the breadth of techniques that are
covered. A simple cost-function is treated on pages 283ff. In teaching the
book, it might be well to make clear to the students that there is a
cost-function associated with every sample design, and that sample-design in the
modern sense means using whatever plan will produce the most information
per dollar.

The illustrations are mostly drawn from the author's wide experience in 
surveys of agricultural and wood lands, but this is excellent, because the
fundamental principles are the same regardless of the material sampled.

The reviewer is in full agreement with the statement on page 59 that where 
a full census is compulsory, a sample should also be compulsory. 

The well-chosen bibliography at the back, divided into convenient
sections, will be very useful.

This review can only conclude with thanks and congratulations to the 
author: his book heralds a turning-point in statistical history. 



JOURNAL OF THE AMERICAN 
STATISTICAL ASSOCIATION 


CONTENTS OF VOLUME 44

The 108th Annual Meeting
Minutes of the Annual Business Meeting . . . 297
Minutes of the Meeting of the Commission on Statistical Standards and Organization . . . 305

Articles . . . 1, 157, 335, 463
Book Reviews . . . 132, 311, 444, 567
Letters About Books . . . 460

Index to Volume 44, 1949
Articles, by Author . . . 587
Book Reviews, by Author . . . 588

Reports and Official Notices
Report of the Board of Directors . . . 300
Report of the Secretary . . . 303
Report of the Nominating Committee . . . 304
Report of the Committee on Fellows . . . 304
Report of the Treasurer . . . 307
Report of the Auditors . . . 308















INDEX TO VOLUME 44: 1949 

ARTICLES 

Alchian, Armen A., Note on Some Errors in “The Evidence of Periodicity in Short Time Series” . . . 559
Berkson, Joseph, Minimum χ² and Maximum Likelihood Solution in Terms of a Linear Transform, with Particular Reference to Bio-Assay . . . 273
Berrettoni, Julio N., and Greb, Donald J., AOQL Single Sampling Plans from a Single Chart and Table . . . 62
Brennan, J. F., Evaluation of Parameters in the Gompertz and Makeham Equations . . . 116
Chandra Sekar, C., and Deming, W. Edwards, On a Method of Estimating Birth and Death Rates and the Extent of Registration . . . 101
Chevry, Gabriel, Control of a General Census by Means of an Area Sampling Method . . . 373
Cochrane, D., and Orcutt, G. H., Application of Least Squares Regression to Relationships Containing Auto-Correlated Error Terms . . . 32
Cochrane, Donald, and Orcutt, Guy H., A Sampling Study of the Merits of Autoregressive and Reduced Form Transformations in Regression Analysis . . . 356
Cohen, A. C., Jr., On Estimating the Mean and Standard Deviation of Truncated Normal Distributions . . . 518
Deming, W. Edwards, and Chandra Sekar, C., On a Method of Estimating Birth and Death Rates and the Extent of Registration . . . 101
Dodd, Stuart C., On Measuring Languages . . . 77
Freedman, Ronald, and Hawley, Amos H., Unemployment and Migration in the Depression (1930-1935) . . . 260
Geiringer, Hilda, On Some Mathematical Problems Arising in the Development of Mendelian Genetics . . . 526
Greb, Donald J., and Berrettoni, Julio N., AOQL Single Sampling Plans from a Single Chart and Table . . . 62
Hawley, Amos H., and Freedman, Ronald, Unemployment and Migration in the Depression (1930-1935) . . . 260
Jessen, Raymond J., Some Inadequacies of the Federal Censuses of Agriculture . . . 279
Kish, Leslie, A Procedure for Objective Respondent Selection Within the Household . . . 380
Knowler, Lloyd A., and Olds, Edwin G., Teaching Statistical Quality Control for Town and Gown . . . 213
Kuznets, Simon, Wesley Clair Mitchell, An Appreciation . . . 126
Lawrence, Norman, and Shryock, Henry S., Jr., The Current Status of State and Local Population Estimates in the Census Bureau . . . 157
Lester, A. M., The Edge Marking of Statistical Cards . . . 293
Metropolis, Nicholas, and Ulam, S., The Monte Carlo Method . . . 335
Morrison, Nathan, By-Product Data and Forecasting in Unemployment Insurance . . . 397
Moser, C. A., The Use of Sampling in Great Britain . . . 231



















Mosteller, Frederick, and Tukey, John W., The Uses and Usefulness of Binomial Probability Paper . . . 174
Myers, Robert J., Beneficiary Statistics Under the Old-Age and Survivors Insurance Program and Some Possible Demographic Studies Based on these Data . . . 388
Noether, Gottfried E., Confidence Limits in the Non-Parametric Case . . . 89
Novick, David, and Steiner, George A., The War Production Board's Statistical Reporting Experience, V and VI . . . 413
Olds, Edward B., The City Block as a Unit for Recording and Analyzing Urban Data . . . 485
Olds, Edwin G., and Knowler, Lloyd A., Teaching Statistical Quality Control for Town and Gown . . . 213
Orcutt, G. H., and Cochrane, D., Application of Least Squares Regression to Relationships Containing Auto-Correlated Error Terms . . . 32
Orcutt, Guy H., and Cochrane, Donald, A Sampling Study of the Merits of Autoregressive and Reduced Form Transformations in Regression Analysis . . . 356
Politz, Alfred, and Simmons, Willard, An Attempt to Get the “Not at Homes” into the Sample without Callbacks . . . 9
Rasor, Eugene A., The Fitting of Logistic Curves by Means of a Nomograph . . . 548
Rice, Stuart A., William Lane Austin (1871-1949); James Clyde Capt (1888-1949) . . . 565
Shryock, Henry S., Jr., and Lawrence, Norman, The Current Status of State and Local Population Estimates in the Census Bureau . . . 157
Simmons, Willard, and Politz, Alfred, An Attempt to Get the “Not at Homes” into the Sample without Callbacks . . . 9
Snedecor, George W., On a Unique Feature of Statistics . . . 1
Steiner, George A., and Novick, David, The War Production Board's Statistical Reporting Experience, V and VI . . . 413
Tukey, John W., and Mosteller, Frederick, The Uses and Usefulness of Binomial Probability Paper . . . 174
Ulam, S., and Metropolis, Nicholas, The Monte Carlo Method . . . 335
Wallis, W. Allen, Statistics of the Kinsey Report . . . 463
Walsh, John E., On the “Information” Lost by Using a t-Test When the Population Variance is Known . . . 122
Walsh, John E., Applications of Some Significance Tests for the Median Which are Valid Under Very General Conditions . . . 342
Walsh, John E., On the Best Choice of Sample Sizes for a t-Test when the Ratio of Variances is Known . . . 554
Watkins, Ralph J., Statistical Requirements for Economic Mobilization . . . 406
Willcox, Walter F., Conrad Alexander Verrijn Stuart (1866-1948) . . . 295
Woofter, T. J., The Relation of the Net Reproduction Rate to Other Fertility Measures . . . 501


BOOK REVIEWS 

Abbott, J. C., and Benac, T. J., Principles of Counting and Probability . . . Herbert Solomon 133
Anderson, J. L., and Dow, J. B., Actuarial Statistics: Vol. II, Construction of Mortality and Other Tables . . . T. N. E. Greville 311





















Bratt, Elmer Clark, Business Cycles and Forecasting, Third Edition . . . William A. Spurr 134
British Standards Institution, Fraction-Defective Charts . . . Albert H. Bowker 132
Chapin, F. Stuart, Experimental Designs in Sociological Research . . . Margaret Jarman Hagood 312
Charlier, C. V. L., Elements of Mathematical Statistics Including Table of Poisson’s Function by L. v. Bortkiewicz . . . Burton H. Camp 313
Churchman, C. West, Theory of Experimental Inference . . . John W. Tukey 136
Croxton, Frederick E., and Cowden, Dudley J., Practical Business Statistics, Second Edition . . . Alfred Cahen 444
Dahlberg, Gunnar, Mathematical Methods for Population Genetics . . . Howard Levene 447
David, F. N., Probability Theory for Statistical Methods . . . John W. Tukey 567
Douglass, Raymond D., and Adams, Douglas P., Elements of Nomography . . . Joseph Zubin 315
Emmens, C. W., Principles of Biological Assay . . . Lila F. Knudsen 448
Enrick, Norbert L., Quality Control: A Manual of Quality Control Procedure Based Upon Scientific Principles and Simplified for Practical Application in Various Types of Manufacturing Plants . . . J. H. Curtiss 139
Gallup, George, A Guide to Public Opinion Polls, Second Edition . . . Robert Cobb Myers 315
Greenshields, Bruce D., Schapiro, Donald, and Ericksen, Elroy L., Traffic Performance at Urban Street Intersections . . . Harry G. Romig 142; Henry K. Evans 451
Hald, A., The Decomposition of a Series of Observations Composed of a Trend, a Periodic Movement, and a Stochastic Variable . . . D. B. DeLury and Boyd Harshbarger 317
Hendricks, Walter A., Mathematics of Sampling . . . T. A. Bancroft 144
Herdan, G., Quality Control by Statistical Methods . . . Paul Peach 569
Hill, A. Bradford, Principles of Medical Statistics, Fourth Edition . . . Margaret Martin 146
Jeffreys, Harold, Theory of Probability, Second Edition . . . Herbert Robbins 453
Johnson, Palmer O., Statistical Methods in Research . . . Frederick Mosteller 570
Kendall, Maurice G., Rank Correlation . . . E. J. G. Pitman 454
Kennedy, Clifford W., Quality Control Methods . . . Sebastian B. Littauer 320; Charles R. Scott, Jr. 322
Kerrich, J. E., An Experimental Introduction to the Theory of Probability . . . J. F. Kenney 147
Mainland, Donald, Statistical Methods in Medical Research: I, Qualitative Statistics (Enumeration Data) . . . John W. Fertig 148
Matérn, Bertil, Metoder att Uppskatta Noggrannheten vid Linje- och Provytetaxering. [Methods of Estimating the Accuracy of Line and Sample Plot Surveys] . . . T. W. Anderson 323
McNemar, Quinn, Psychological Statistics . . . Edmund Churchill 572
von Mises, Richard, Lecture Notes on Mathematical Theory of Probability and Statistics . . . Benjamin Epstein 326























Panse, V. G., Report on the Scheme for the Improvement of Agricultural Statistics . . . S. Lee Crump 455
Rashevsky, N., Mathematical Theory of Human Relations: An Approach to a Mathematical Biology of Social Phenomena . . . Frederick Mosteller 150
Ricker, William E., Methods of Estimating Vital Statistics of Fish Populations . . . Charles M. Mottley 456
Rissik, H., Quality Control in Production: A Machine-Shop Manual on the Statistical Method of Controlling Product Quality During Manufacture . . . H. A. Freeman 574
Schumacher, F. X., and Chapman, R. A., Sampling Methods in Forestry and Range Management, Second Edition . . . Walter H. Meyer 575
Sumner, W. L., Statistics in School . . . F. G. Cornell 327
van Uven, M. J., Mathematical Treatment of the Results of Agricultural and Other Experiments, Second Edition . . . G. A. Baker 329
Wiener, Norbert, Cybernetics . . . Sebastian B. Littauer 577
Wilks, S. S., Elementary Statistical Analysis . . . T. A. Bancroft 458
Wintner, Aurel, The Fourier Transforms of Probability Distributions . . . J. Wolfowitz 330
Wold, Herman, Random Normal Deviates: 25,000 Items Compiled from Tract No. XXIV (M. G. Kendall and B. Babington Smith's Tables of Random Sampling Numbers) . . . H. Burke Horton 331
Yates, Frank, Sampling Methods for Censuses and Surveys . . . W. Edwards Deming 580
Zeisel, Hans, Say It With Figures . . . Gregor Sebba 332














